Wavelet domain watermarks

ABSTRACT

A wavelet domain watermark encoder and decoder embed and detect auxiliary signals (1300, 1302) in a media signal (1306), such as a still image, video or audio signal. A watermark orientation signal (1302) is embedded in a wavelet decomposed signal (1304) to facilitate detection of the watermark in a geometrically distorted version of the embedded signal.

RELATED APPLICATION DATA

[0001] This application is a continuation in part of U.S. patent application Ser. No. 09/503,881, filed Feb. 14, 2000.

TECHNICAL FIELD

[0002] The invention relates to digital watermarks, and specifically relates to encoding and decoding auxiliary information in media signals, such as images, audio and video.

BACKGROUND AND SUMMARY

[0003] Digital watermarking is a process for modifying physical or electronic media to embed a machine-readable code into the media. The media may be modified such that the embedded code is imperceptible or nearly imperceptible to the user, yet may be detected through an automated detection process. Most commonly, digital watermarking is applied to media signals such as images, audio signals, and video signals. However, it may also be applied to other types of media objects, including documents (e.g., through line, word or character shifting), software, multi-dimensional graphics models, and surface textures of objects.

[0004] Digital watermarking systems typically have two primary components: an encoder that embeds the watermark in a host media signal, and a decoder that detects and reads the embedded watermark from a signal suspected of containing a watermark (a suspect signal). The encoder embeds a watermark by altering the host media signal. The reading component analyzes a suspect signal to detect whether a watermark is present. In applications where the watermark encodes information, the reader extracts this information from the detected watermark.

[0005] Several particular watermarking techniques have been developed. The reader is presumed to be familiar with the literature in this field. Particular techniques for embedding and detecting imperceptible watermarks in media signals are detailed in the present assignee's co-pending application Ser. No. 09/503,881.

[0006] The invention relates to encoding and decoding auxiliary signals in a media signal, such as a still image, video or audio signal, using a wavelet or subband decomposition of the signal. One aspect of the invention is a method of embedding an auxiliary signal into a media signal so that the auxiliary signal is substantially imperceptible. This method performs a wavelet decomposition of the media signal, and embeds a watermark orientation signal into the wavelet decomposition. The auxiliary signal includes a signal used to determine orientation of the auxiliary signal called the watermark orientation signal. This orientation signal has attributes used to determine orientation of the auxiliary signal in a geometrically distorted version of the media signal. The orientation signal may carry a message comprising one or more symbols (e.g., binary or M-ary symbols) of information. Alternatively, a separate watermark message signal may carry the message. When embedded as separate signals, the orientation and message signal components of the watermark may be orthogonal to one another. In a wavelet decomposition of the host signal, the watermark embedder may insert the message and orientation signal components in separate subbands.

[0007] Another aspect of the invention is a method of detecting an auxiliary signal embedded in a media signal, where the auxiliary information is substantially imperceptible in an output form of the media signal. This method performs a wavelet decomposition of the media signal into two or more levels of resolution. It correlates a reference watermark orientation signal with the wavelet decomposition of the media signal to determine orientation of the auxiliary signal in the media signal.

[0008] Another aspect of the invention is an alternative method of encoding an auxiliary signal in a media signal. This method performs a wavelet decomposition of the media signal into two or more levels of resolution, including an approximate level and one or more higher resolution levels. It then modifies the approximate level to encode an auxiliary signal such that the modification is substantially imperceptible in an output form of the media signal.

[0009] Another aspect of the invention is a method of detecting an auxiliary signal embedded in a media signal, where the auxiliary information is substantially imperceptible in an output form of the media signal. This method performs a wavelet decomposition of the media signal into two or more levels of resolution, including an approximate level and one or more higher resolution levels, and detects the auxiliary information from the approximate level.

[0010] Additional features of the invention will become apparent with reference to the following detailed description and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 is a block diagram illustrating an image watermark system.

[0012] FIG. 2 is a block diagram illustrating an image watermark embedder.

[0013] FIG. 3 is a spatial frequency domain plot of a detection watermark signal.

[0014] FIG. 4 is a flow diagram of a process for detecting a watermark signal in an image and computing its orientation within the image.

[0015] FIG. 5 is a flow diagram of a process for reading a message encoded in a watermark.

[0016] FIG. 6 is a diagram depicting an example of a watermark detection process.

[0017] FIG. 7 is a diagram depicting the orientation of a transformed image superimposed over the original orientation of the image at the time of watermark encoding.

[0018] FIG. 8 is a diagram illustrating an implementation of a watermark embedder.

[0019] FIG. 9 is a diagram depicting an assignment map used to map raw bits in a message to locations within a host image.

[0020] FIG. 10 illustrates an example of a watermark orientation signal in a spatial frequency domain.

[0021] FIG. 11 illustrates the orientation signal shown in FIG. 10 in the spatial domain.

[0022] FIG. 12 is a diagram illustrating an overview of a watermark detector implementation.

[0023] FIG. 13 is a diagram illustrating an implementation of the detector pre-processor depicted generally in FIG. 12.

[0024] FIG. 14 is a diagram illustrating a process for estimating rotation and scale vectors of a detection watermark signal.

[0025] FIG. 15 is a diagram illustrating a process for refining the rotation and scale vectors, and for estimating differential scale parameters of the detection watermark signal.

[0026] FIG. 16 is a diagram illustrating a process for aggregating evidence of the orientation signal and orientation parameter candidates from two or more frames.

[0027] FIG. 17 is a diagram illustrating a process for estimating translation parameters of the detection watermark signal.

[0028] FIG. 18 is a diagram illustrating a process for refining orientation parameters using known message bits in the watermark message.

[0029] FIG. 19 is a diagram illustrating a process for reading a watermark message from an image, after re-orienting the image data using an orientation vector.

[0030] FIG. 20 is a diagram of a computer system that serves as an operating environment for software implementations of a watermark embedder, detector and reader.

[0031] FIG. 21 is a block diagram of a wavelet based watermark embedding process.

[0032] FIG. 22 is a block diagram of a watermark decoding process for a wavelet domain watermark.

[0033] FIG. 23 is a diagram depicting a wavelet decomposition.

[0034] FIG. 24 is a diagram depicting a process for multi-level detection of a watermark.

[0035] FIG. 25 is a flow diagram depicting a correlation process for detecting a watermark and determining its orientation.

DETAILED DESCRIPTION

[0036] 1.0 Introduction

[0037] A watermark can be viewed as an information signal that is embedded in a host signal, such as an image, audio, or some other media content. Watermarking systems based on the following detailed description may include the following components: 1) An embedder that inserts a watermark signal in the host signal to form a combined signal; 2) A detector that determines the presence and orientation of a watermark in a potentially corrupted version of the combined signal; and 3) A reader that extracts a watermark message from the combined signal. In some implementations, the detector and reader are combined.

[0038] The structure and complexity of the watermark signal can vary significantly, depending on the application. For example, the watermark may be comprised of one or more signal components, each defined in the same or different domains. Each component may perform one or more functions. Two primary functions include acting as an identifier to facilitate detection and acting as an information carrier to convey a message. In addition, components may be located in different spatial or temporal portions of the host signal, and may carry the same or different messages.

[0039] The host signal can vary as well. The host is typically some form of multi-dimensional media signal, such as an image, audio sequence or video sequence. In the digital domain, each of these media types is represented as a multi-dimensional array of discrete samples. For example, a color image has spatial dimensions (e.g., its horizontal and vertical components), and color space dimensions (e.g., YUV or RGB). Some signals, like video, have spatial and temporal dimensions. Depending on the needs of a particular application, the embedder may insert a watermark signal that exists in one or more of these dimensions.

[0040] In the design of the watermark and its components, developers are faced with several design issues such as: the extent to which the mark is impervious to jamming and manipulation (either intentional or unintentional); the extent of imperceptibility; the quantity of information content; the extent to which the mark facilitates detection and recovery; and the extent to which the information content can be recovered accurately.

[0041] For certain applications, such as copy protection or authentication, the watermark should be difficult to tamper with or remove by those seeking to circumvent it. To be robust, the watermark must withstand routine manipulation, such as data compression, copying, linear transformation, flipping, inversion, etc., and intentional manipulation intended to remove the mark or make it undetectable. Some applications require the watermark signal to remain robust through digital to analog conversion (e.g., printing an image or playing music), and analog to digital conversion (e.g., scanning the image or digitally sampling the music). In some cases, it is beneficial for the watermarking technique to withstand repeated watermarking.

[0042] A variety of signal processing techniques may be applied to address some or all of these design considerations. One such technique is referred to as spreading. Sometimes categorized as a spread spectrum technique, spreading is a way to distribute a message into a number of components (chips), which together make up the entire message. Spreading makes the mark more impervious to jamming and manipulation, and makes it less perceptible.
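
The following Python sketch illustrates one simple form of spreading. It is not taken from this description: the function names, the use of a seeded pseudo-random +1/-1 chip sequence, and the choice of eight chips per bit are all assumptions made for illustration. Each message bit is distributed over several chips, and the reader recovers the bit by summing the products of received chips and the known chip sequence.

```python
import random

def make_chips(num_chips, seed):
    """Hypothetical helper: pseudo-random +1/-1 chips; the seed acts like a key."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(num_chips)]

def spread(bits, chips_per_bit, seed):
    """Distribute each message bit over chips_per_bit chips."""
    chips = make_chips(len(bits) * chips_per_bit, seed)
    out = []
    for i, bit in enumerate(bits):
        polarity = 1 if bit else -1          # assumed mapping: 1 -> +1, 0 -> -1
        start = i * chips_per_bit
        out.extend(polarity * c for c in chips[start:start + chips_per_bit])
    return out

def despread(received, chips_per_bit, seed):
    """Recover bits by correlating each group of chips with the known sequence."""
    chips = make_chips(len(received), seed)
    bits = []
    for start in range(0, len(received), chips_per_bit):
        end = start + chips_per_bit
        score = sum(r * c for r, c in zip(received[start:end], chips[start:end]))
        bits.append(1 if score > 0 else 0)
    return bits

message = [1, 0, 1, 1]
sent = spread(message, chips_per_bit=8, seed=42)
recovered = despread(sent, chips_per_bit=8, seed=42)
```

Because each bit is carried by several chips, a small number of corrupted chips per bit does not change the sign of the sum, which is one way to see why spreading helps the mark survive manipulation.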

[0043] Another category of signal processing technique is error correction and detection coding. Error correction coding is useful to reconstruct the message accurately from the watermark signal. Error detection coding enables the decoder to determine when the extracted message has an error.

[0044] Another signal processing technique that is useful in watermark coding is called scattering. Scattering is a method of distributing the message or its components among an array of locations in a particular transform domain, such as a spatial domain or a spatial frequency domain. Like spreading, scattering makes the watermark less perceptible and more impervious to manipulation.

[0045] Yet another signal processing technique is gain control. Gain control is used to adjust the intensity of the watermark signal. The intensity of the signal impacts a number of aspects of watermark coding, including its perceptibility to the ordinary observer, and the ability to detect the mark and accurately recover the message from it.

[0046] Gain control can impact the various functions and components of the watermark differently. Thus, in some cases, it is useful to control the gain while taking into account its impact on the message and orientation functions of the watermark or its components. For example, in a watermark system described below, the embedder calculates a different gain for orientation and message components of an image watermark.

[0047] Another useful tool in watermark embedding and reading is perceptual analysis. Perceptual analysis refers generally to techniques for evaluating signal properties based on the extent to which those properties are (or are likely to be) perceptible to humans (e.g., listeners or viewers of the media content). A watermark embedder can take advantage of a Human Visual System (HVS) model to determine where to place a watermark and how to control the intensity of the watermark so that chances of accurately recovering the watermark are enhanced, resistance to tampering is increased, and perceptibility of the watermark is reduced. Such perceptual analysis can play an integral role in gain control because it helps indicate how the gain can be adjusted relative to the impact on the perceptibility of the mark. Perceptual analysis can also play an integral role in locating the watermark in a host signal. For example, one might design the embedder to hide a watermark in portions of a host signal that are more likely to mask the mark from human perception.

[0048] Various forms of statistical analyses may be performed on a signal to identify places to locate the watermark, and to identify places where to extract the watermark. For example, a statistical analysis can identify portions of a host image that have noise-like properties that are likely to make recovery of the watermark signal difficult. Similarly, statistical analyses may be used to characterize the host signal to determine where to locate the watermark.

[0049] Each of the techniques may be used alone, in various combinations, and in combination with other signal processing techniques.

[0050] In addition to selecting the appropriate signal processing techniques, the developer is faced with other design considerations. One consideration is the nature and format of the media content. In the case of digital images, for example, the image data is typically represented as an array of image samples. Color images are represented as an array of color vectors in a color space, such as RGB or YUV. The watermark may be embedded in one or more of the color components of an image. In some implementations, the embedder may transform the input image into a target color space, and then proceed with the embedding process in that color space.

[0051] 2.0 Digital Watermark Embedder and Reader Overview

[0052] The following sections describe implementations of a watermark embedder and reader that operate on digital signals. The embedder encodes a message into a digital signal by modifying its sample values such that the message is imperceptible to the ordinary observer in output form. To extract the message, the reader captures a representation of the signal suspected of containing a watermark and then processes it to detect the watermark and decode the message.

[0053] FIG. 1 is a block diagram summarizing signal processing operations involved in embedding and reading a watermark. There are three primary inputs to the embedding process: the original, digitized signal 100, the message 102, and a series of control parameters 104. The control parameters may include one or more keys. One key or set of keys may be used to encrypt the message. Another key or set of keys may be used to control the generation of a watermark carrier signal or a mapping of information bits in the message to positions in a watermark information signal.

[0054] The carrier signal or mapping of the message to the host signal may be encrypted as well. Such encryption may increase security by varying the carrier or mapping for different components of the watermark or watermark message. Similarly, if the watermark or watermark message is redundantly encoded throughout the host signal, one or more encryption keys can be used to scramble the carrier or signal mapping for each instance of the redundantly encoded watermark. This use of encryption provides one way to vary the encoding of each instance of the redundantly encoded message in the host signal. Other parameters may include control bits added to the message, and watermark signal attributes (e.g., orientation or other detection patterns) used to assist in the detection of the watermark.

[0055] Apart from encrypting or scrambling the carrier and mapping information, the embedder may apply a different, and possibly unique, carrier or mapping for different components of a message, for different messages, or for different watermarks or watermark components to be embedded in the host signal. For example, one watermark may be encoded in a block of samples with one carrier, while another, possibly different watermark, is encoded in a different block with a different carrier. A similar approach is to use different mappings in different blocks of the host signal.

[0056] The watermark embedding process 106 converts the message to a watermark information signal. It then combines this signal with the input signal and possibly another signal (e.g., an orientation pattern) to create a watermarked signal 108. The process of combining the watermark with the input signal may be a linear or non-linear function. Examples of watermarking functions include: S*=S+gX; S*=S(1+gX); and S*=S e^(gX); where S* is the watermarked signal vector, S is the input signal vector, and g is a function controlling watermark intensity. The watermark may be applied by modulating signal samples S in the spatial, temporal or some other transform domain.
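
As a minimal sketch, the three example watermarking functions can be written as follows in Python. The text does not define X, so this sketch assumes X is the watermark signal vector; it also simplifies g to a constant scalar gain, whereas the text describes g as a function controlling intensity. The function names and sample values are hypothetical.

```python
import math

def embed_additive(s, x, g):
    """S* = S + gX, applied sample by sample."""
    return [si + g * xi for si, xi in zip(s, x)]

def embed_multiplicative(s, x, g):
    """S* = S(1 + gX), so the change scales with the host sample."""
    return [si * (1 + g * xi) for si, xi in zip(s, x)]

def embed_exponential(s, x, g):
    """S* = S e^(gX)."""
    return [si * math.exp(g * xi) for si, xi in zip(s, x)]

host = [100.0, 120.0, 90.0, 110.0]   # S: illustrative host samples
mark = [1.0, -1.0, 1.0, -1.0]        # X: assumed watermark signal vector
additive = embed_additive(host, mark, 2.0)
```

In the additive form the change to each sample is independent of the host value, while in the other two forms larger host samples receive proportionally larger changes.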

[0057] To encode a message, the watermark encoder analyzes and selectively adjusts the host signal to give it attributes that correspond to the desired message symbol or symbols to be encoded. There are many signal attributes that may encode a message symbol, such as a positive or negative polarity of signal samples or a set of samples, a given parity (odd or even), a given difference value or polarity of the difference between signal samples (e.g., a difference between selected spatial intensity values or transform coefficients), a given distance value between watermarks, a given phase or phase offset between different watermark components, a modulation of the phase of the host signal, a modulation of frequency coefficients of the host signal, a given frequency pattern, a given quantizer (e.g., in Quantization Index Modulation), etc.

[0058] Some processes for combining the watermark with the input signal are termed non-linear, such as processes that employ dither modulation, modify least significant bits, or apply quantization index modulation. One type of non-linear modulation is where the embedder sets signal values so that they have some desired value or characteristic corresponding to a message symbol. For example, the embedder may designate that a portion of the host signal is to encode a given bit value. It then evaluates a signal value or set of values in that portion to determine whether they have the attribute corresponding to the message bit to be encoded. Some examples of attributes include a positive or negative polarity, a value that is odd or even, a checksum, etc. For example, a bit value may be encoded as a one or zero by quantizing the value of a selected sample to be even or odd. As another example, the embedder might compute a checksum or parity of an N bit pixel value or transform coefficient and then set the least significant bit to the value of the checksum or parity. Of course, if the signal already corresponds to the desired message bit value, it need not be altered. The same approach can be extended to a set of signal samples where some attribute of the set is adjusted as necessary to encode a desired message symbol. These techniques can be applied to signal samples in a transform domain (e.g., transform coefficients) or samples in the temporal or spatial domains.
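
The even/odd example can be sketched in Python as shown below. The mapping of even to zero and odd to one, the 8-bit sample range, and the function names are assumptions made for this illustration only.

```python
def embed_bit_parity(sample, bit):
    """Return a sample whose parity matches bit (assumed: even = 0, odd = 1)."""
    if sample % 2 == bit:
        return sample                 # already has the desired attribute
    # Move by one level, staying inside the assumed 0..255 range.
    return sample + 1 if sample < 255 else sample - 1

def read_bit_parity(sample):
    """Read the bit back from the parity of the sample."""
    return sample % 2

marked = embed_bit_parity(100, 1)
```

As the paragraph notes, a sample that already corresponds to the desired bit is left unaltered, so on average only about half of the selected samples change.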

[0059] Quantization index modulation techniques employ a set of quantizers. In these techniques, the message to be transmitted is used as an index for quantizer selection. In the decoding process, a distance metric is evaluated for all quantizers and the index with the smallest distance identifies the message value.
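
A minimal Python sketch of binary quantization index modulation follows. It assumes two uniform scalar quantizers with a hypothetical step size of 8, offset from each other by half a step; the function names and the absolute-difference distance metric are likewise illustrative choices, not details from this description.

```python
def qim_embed(value, bit, step=8.0):
    """Quantize value with the quantizer selected by the message bit."""
    offset = bit * step / 2.0          # quantizer 1 is shifted by half a step
    return round((value - offset) / step) * step + offset

def qim_decode(value, step=8.0):
    """Return the index of the quantizer whose nearest point is closest."""
    distances = [abs(value - qim_embed(value, b, step)) for b in (0, 1)]
    return distances.index(min(distances))

marked_zero = qim_embed(21.0, 0)
marked_one = qim_embed(21.0, 1)
```

With this layout the decoder still selects the right quantizer as long as the distortion added to a marked value stays below a quarter of the step size.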

[0060] The watermark detector 110 operates on a digitized signal suspected of containing a watermark. As depicted generally in FIG. 1, the suspect signal may undergo various transformations 112, such as conversion to and from an analog domain, cropping, copying, editing, compression/decompression, transmission etc. Using parameters 114 from the embedder (e.g., orientation pattern, control bits, key(s)), it performs a series of correlation or other operations on the captured image to detect the presence of a watermark. If it finds a watermark, it determines its orientation within the suspect signal.

[0061] Using the orientation, if necessary, the reader 116 extracts the message. Some implementations do not perform correlation, but instead, use some other detection process or proceed directly to extract the watermark signal. For instance in some applications, a reader may be invoked one or more times at various temporal or spatial locations in an attempt to read the watermark, without a separate pre-processing stage to detect the watermark's orientation.

[0062] Some implementations require the original, un-watermarked signal to decode a watermark message, while others do not. In those approaches where the original signal is not necessary, the original un-watermarked signal can still be used to improve the accuracy of message recovery. For example, the original signal can be removed, leaving a residual signal from which the watermark message is recovered. If the decoder does not have the original signal, it can still attempt to remove portions of it (e.g., by filtering) that are expected not to contain the watermark signal.

[0063] Watermark decoder implementations use known relationships between a watermark signal and a message symbol to extract estimates of message symbol values from a signal suspected of containing a watermark. The decoder has knowledge of the properties of message symbols and how and where they are encoded into the host signal to encode a message. For example, it knows how message bit values of one and a zero are encoded and it knows where these message bits are originally encoded. Based on this information, it can look for the message properties in the watermarked signal. For example, it can test the watermarked signal to see if it has attributes of each message symbol (e.g., a one or zero) at a particular location and generate a probability measure as an indicator of the likelihood that a message symbol has been encoded. Knowing the approximate location of the watermark in the watermarked signal, the reader implementation may compare known message properties with the properties of the watermarked signal to estimate message values, even if the original signal is unavailable. Distortions to the watermarked signal and the host signal itself make the watermark difficult to recover, but accurate recovery of the message can be enhanced using a variety of techniques, such as error correction coding, watermark signal prediction, redundant message encoding, etc.

[0064] One way to recover a message value from a watermarked signal is to perform correlation between the known message property of each message symbol and the watermarked signal. If the amount of correlation exceeds a threshold, for example, then the watermarked signal may be assumed to contain the message symbol. The same process can be repeated for different symbols at various locations to extract a message. A symbol (e.g., a binary value of one or zero) or set of symbols may be encoded redundantly to enhance message recovery.
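
The threshold test can be sketched in Python as follows. The normalized dot product used as the correlation measure, the alternating +1/-1 reference pattern, the sample values and the threshold of 1.0 are all hypothetical choices for illustration.

```python
def correlate(signal, pattern):
    """Average product of the signal samples with the known symbol pattern."""
    return sum(s * p for s, p in zip(signal, pattern)) / len(pattern)

def detect_symbol(signal, pattern, threshold):
    """Assume the symbol is present when the correlation exceeds the threshold."""
    return correlate(signal, pattern) > threshold

pattern = [1, -1, 1, -1, 1, -1, 1, -1]        # assumed known message property
host = [10, 10, 12, 12, 9, 9, 11, 11]         # illustrative unmarked samples
marked = [h + 2 * p for h, p in zip(host, pattern)]

present = detect_symbol(marked, pattern, threshold=1.0)
absent = detect_symbol(host, pattern, threshold=1.0)
```

In this toy case the host happens to be uncorrelated with the pattern, so the marked signal correlates at exactly the embedding strength of 2. In practice the host contributes interference, which is one reason redundant encoding of symbols helps.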

[0065] In some cases, it is useful to filter the watermarked signal to remove aspects of the signal that are unlikely to be helpful in recovering the message and/or are likely to interfere with the watermark message. For example, the decoder can filter out portions of the original signal and another watermark signal or signals. In addition, when the original signal is unavailable, the reader can estimate or predict the original signal based on properties of the watermarked signal. The original or predicted version of the original signal can then be used to recover an estimate of the watermark message. One way to use the predicted version to recover the watermark is to remove the predicted version before reading the desired watermark. Similarly, the decoder can predict and remove un-wanted watermarks or watermark components before reading the desired watermark in a signal having two or more watermarks.

[0066] 2.1 Image Watermark Embedder

[0067] FIG. 2 is a block diagram illustrating an implementation of an exemplary embedder in more detail. The embedding process begins with the message 200. As noted above, the message is a binary number suitable for conversion to a watermark signal. For additional security, the message, its carrier, and the mapping of the watermark to the host signal may be encrypted with an encryption key 202. In addition to the information conveyed in the message, the embedder may also add control bit values (“signature bits”) to the message to assist in verifying the accuracy of a read operation. These control bits, along with the bits representing the message, are input to an error correction coding process 204 designed to increase the likelihood that the message can be recovered accurately in the reader.

[0068] There are several alternative error correction coding schemes that may be employed.

[0069] Some examples include BCH, convolutional, Reed-Solomon and turbo codes. These forms of error correction coding are sometimes used in communication applications where data is encoded in a carrier signal that transfers the encoded data from one place to another. In the digital watermarking application discussed here, the raw bit data is encoded in a fundamental carrier signal.

[0070] In addition to the error correction coding schemes mentioned above, the embedder and reader may also use a Cyclic Redundancy Check (CRC) to facilitate detection of errors in the decoded message data.
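
As a hedged illustration of error detection with a CRC, the Python sketch below appends a 32-bit CRC to a message and checks it on the reading side. The choice of CRC-32 (via the standard library function zlib.crc32), the byte layout, the function names and the sample message are assumptions; the description does not specify which CRC is used.

```python
import zlib

def append_crc(message: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the message (assumed big-endian layout)."""
    return message + zlib.crc32(message).to_bytes(4, "big")

def check_crc(payload: bytes) -> bool:
    """Recompute the CRC over the message part and compare with the stored one."""
    message, stored = payload[:-4], payload[-4:]
    return zlib.crc32(message).to_bytes(4, "big") == stored

payload = append_crc(b"id:1234")
# Flip one bit in the first byte to simulate a read error.
corrupted = bytes([payload[0] ^ 1]) + payload[1:]
```

A failed check tells the reader that the decoded message contains an error, without indicating which bits are wrong. Correcting them is the job of the error correction code.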

[0071] The error correction coding function 204 produces a string of bits, termed raw bits 206, that are embedded into a watermark information signal. Using a carrier signal 208 and an assignment map 210, the illustrated embedder encodes the raw bits in a watermark information signal 212, 214. In some applications, the embedder may encode a different message in different locations of the signal. The carrier signal may be a noise image. For each raw bit, the assignment map specifies the corresponding image sample or samples that will be modified to encode that bit.

[0072] The embedder depicted in FIG. 2 operates on blocks of image data (referred to as ‘tiles’) and replicates a watermark in each of these blocks. As such, the carrier signal and assignment map both correspond to an image block of a pre-determined size, namely, the size of the tile. To encode each bit, the embedder applies the assignment map to determine the corresponding image samples in the block to be modified to encode that bit. Using the map, it finds the corresponding image samples in the carrier signal. For each bit, the embedder computes the value of image samples in the watermark information signal as a function of the raw bit value and the value(s) of the corresponding samples in the carrier signal.

[0073] To illustrate the embedding process further, it is helpful to consider an example. First, consider the following background. Digital watermarking processes are sometimes described in terms of the transform domain in which the watermark signal is defined. The watermark may be defined in the spatial or temporal domain, or some other transform domain such as a wavelet transform, Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT), Hadamard transform, Hartley transform, Karhunen-Loeve transform (KLT) domain, etc.

[0074] Consider an example where the watermark is defined in a transform domain (e.g., a frequency domain such as DCT, wavelet or DFT). The embedder segments the image in the spatial domain into rectangular tiles and transforms the image samples in each tile into the transform domain. For example in the DCT domain, the embedder segments the image into N by N blocks and transforms each block into an N by N block of DCT coefficients. In this example, the assignment map specifies the corresponding sample location or locations in the frequency domain of the tile that correspond to a bit position in the raw bits. In the frequency domain, the carrier signal looks like a noise pattern. Each image sample in the frequency domain of the carrier signal is used together with a selected raw bit value to compute the value of the image sample at the location in the watermark information signal.

[0075] Now consider an example where the watermark is defined in the spatial domain. The embedder segments the image in the spatial domain into rectangular tiles of image samples (i.e. pixels). In this example, the assignment map specifies the corresponding sample location or locations in the tile that correspond to each bit position in the raw bits. In the spatial domain, the carrier signal looks like a noise pattern extending throughout the tile. Each image sample in the spatial domain of the carrier signal is used together with a selected raw bit value to compute the value of the image sample at the same location in the watermark information signal.

[0076] With this background, the embedder proceeds to encode each raw bit in the selected transform domain as follows. It uses the assignment map to look up the position of the corresponding image sample (or samples) in the carrier signal. The image sample value at that position in the carrier controls the value of the corresponding position in the watermark information signal. In particular, the carrier sample value indicates whether to invert the corresponding watermark sample value. The raw bit value is either a one or zero. Disregarding for a moment the impact of the carrier signal, the embedder adjusts the corresponding watermark sample upward to represent a one, or downward to represent a zero. Now, if the carrier signal indicates that the corresponding sample should be inverted, the embedder adjusts the watermark sample downward to represent a one, and upward to represent a zero. In this manner, the embedder computes the value of the watermark samples for a raw bit using the assignment map to find the spatial location of those samples within the block.
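
The procedure in the preceding paragraph can be sketched in Python as follows. The representation is an assumption for illustration: the carrier is a list of +1/-1 values in which -1 means "invert", the assignment map is a list that gives the sample indices for each raw bit, and an upward or downward adjustment is represented as +1 or -1. All names are hypothetical.

```python
def build_watermark(raw_bits, carrier, assignment_map):
    """Compute watermark samples from raw bits, carrier and assignment map."""
    watermark = [0] * len(carrier)
    for bit, locations in zip(raw_bits, assignment_map):
        adjust = 1 if bit == 1 else -1       # up for a one, down for a zero
        for loc in locations:
            # A carrier value of -1 inverts the adjustment at this location.
            watermark[loc] = adjust * carrier[loc]
    return watermark

def read_bits(watermark, carrier, assignment_map):
    """Undo the carrier inversion and vote over each bit's locations."""
    bits = []
    for locations in assignment_map:
        score = sum(watermark[loc] * carrier[loc] for loc in locations)
        bits.append(1 if score > 0 else 0)
    return bits

carrier = [1, -1, -1, 1]
assignment_map = [[0, 2], [1, 3]]            # bit 0 -> samples 0, 2; bit 1 -> samples 1, 3
watermark = build_watermark([1, 0], carrier, assignment_map)
```

Here the first bit (a one) produces an upward adjustment at sample 0 and, because the carrier inverts sample 2, a downward adjustment there. The reading side multiplies by the same carrier to undo the inversion before deciding each bit.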

[0077] From this example, a number of points can be made. First, the embedder may perform a similar approach in any other transform domain. Second, for each raw bit, the corresponding watermark sample or samples are some function of the raw bit value and the carrier signal value. The specific mathematical relationship between the watermark sample, on one hand, and the raw bit value and carrier signal, on the other, may vary with the implementation. For example, the message may be convolved with the carrier, multiplied with the carrier, added to the carrier, or applied based on another non-linear function. Third, the carrier signal may remain constant for a particular application, or it may vary from one message to another. For example, a secret key may be used to generate the carrier signal. For each raw bit, the assignment map may define a pattern of watermark samples in the transform domain in which the watermark is defined. An assignment map that maps a raw bit to a sample location or set of locations (i.e. a map to locations in a frequency or spatial domain) is just one special case of an assignment map for a transform domain. Fourth, the assignment map may remain constant, or it may vary from one message to another. In addition, the carrier signal and map may vary depending on the nature of the underlying image. In sum, there are many possible design choices within the implementation framework described above.

[0078] The embedder depicted in FIG. 2 combines another watermark component, shown as the detection watermark 216, with the watermark information signal to compute the final watermark signal. The detection watermark is specifically chosen to assist in identifying the watermark and computing its orientation in a detection operation.

[0079] FIG. 3 is a spatial frequency plot illustrating one quadrant of a detection watermark. The points in the plot represent impulse functions indicating signal content of the detection watermark signal. The pattern of impulse functions for the illustrated quadrant is replicated in all four quadrants. There are a number of properties of the detection pattern that impact its effectiveness for a particular application. The selection of these properties is highly dependent on the application. One property is the extent to which the pattern is symmetric about one or more axes. For example, if the detection pattern is symmetrical about the horizontal and vertical axes, it is referred to as being quad symmetric. If it is further symmetrical about diagonal axes at an angle of 45 degrees, it is referred to as being octally symmetric (repeated in a symmetric pattern 8 times about the origin). Such symmetry aids in identifying the watermark in an image, and aids in extracting the rotation angle. However, in the case of an octally symmetric pattern, the detector includes an additional step of testing which of the four quadrants the orientation angle falls into.

[0080] Another criterion is the position of the impulse functions and the frequency range that they reside in. Preferably, the impulse functions fall in a mid frequency range. If they are located in a low frequency range, they may be noticeable in the watermarked image. If they are located in the high frequency range, they are more difficult to recover. Also, they should be selected so that scaling, rotation, and other manipulations of the watermarked signal do not push the impulse functions outside the range of the detector. Finally, the impulse functions should preferably not fall on the vertical or horizontal axes, and each impulse function should have a unique horizontal and vertical location. While the example depicted in FIG. 3 shows that some of the impulse functions fall on the same horizontal axis, it is trivial to alter the position of the impulse functions such that each has a unique vertical or horizontal coordinate.

[0081] Returning to FIG. 2, the embedder makes a perceptual analysis 218 of the input image 220 to identify portions of the image that can withstand more watermark signal content without substantially impacting image fidelity. Generally, the perceptual analysis employs a HVS model to identify signal frequency bands and/or spatial areas to increase or decrease watermark signal intensity to make the watermark imperceptible to an ordinary observer. One type of model is to increase watermark intensity in frequency bands and spatial areas where there is more image activity. In these areas, the sample values are changing more than other areas and have more signal strength. The output of the perceptual analysis is a perceptual mask 222. The mask may be implemented as an array of functions, which selectively increase the signal strength of the watermark signal based on a HVS model analysis of the input image. The mask may selectively increase or decrease the signal strength of the watermark signal in areas of greater signal activity.

[0082] The embedder combines (224) the watermark information, the detection signal and the perceptual mask to yield the watermark signal 226. Finally, it combines (228) the input image 220 and the watermark signal 226 to create the watermarked image 230. In the frequency domain watermark example above, the embedder combines the transform domain coefficients in the watermark signal to the corresponding coefficients in the input image to create a frequency domain representation of the watermarked image. It then transforms the image into the spatial domain. As an alternative, the embedder may be designed to convert the watermark into the spatial domain, and then add it to the image.

[0083] In the spatial watermark example above, the embedder combines the image samples in the watermark signal to the corresponding samples in the input image to create the watermarked image 230.

[0084] The embedder may employ an invertible or non-invertible, and linear or non-linear function to combine the watermark signal and the input image (e.g., linear functions such as S*=S+gX; or S*=S(1+gX), convolution, quantization index modulation). The net effect is that some image samples in the input image are adjusted upward, while others are adjusted downward. The extent of the adjustment is greater in areas or subbands of the image having greater signal activity.
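The two linear combining functions named above, S*=S+gX and S*=S(1+gX), can be sketched as follows. The function names and the sample values are illustrative assumptions; S is a host sample, X a watermark sample and g a gain.

```python
def embed_additive(s, x, g):
    """S* = S + gX: the adjustment is independent of the host sample value."""
    return s + g * x


def embed_multiplicative(s, x, g):
    """S* = S(1 + gX): the adjustment scales with the host sample value."""
    return s * (1 + g * x)


host = [100.0, 50.0]
watermark = [1, -1]  # +1 adjusts upward, -1 adjusts downward

additive = [embed_additive(s, x, 2.0) for s, x in zip(host, watermark)]
multiplicative = [embed_multiplicative(s, x, 0.25) for s, x in zip(host, watermark)]
```

With the multiplicative form, the larger host sample receives the larger absolute adjustment, which is one simple way an adjustment can track signal strength.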

[0085] 2.2. Overview of a Detector and Reader

[0086] FIG. 4 is a flow diagram illustrating an overview of a watermark detection process. This process analyzes image data 400 to search for an orientation pattern of a watermark in an image suspected of containing the watermark (the target image). First, the detector transforms the image data to another domain 402, namely the spatial frequency domain, and then performs a series of correlation or other detection operations 404. The correlation operations match the orientation pattern with the target image data to detect the presence of the watermark and its orientation parameters 406 (e.g., translation, scale, rotation, and differential scale relative to its original orientation). Finally, it re-orients the image data based on one or more of the orientation parameters 408.

[0087] If the orientation of the watermark is recovered, the reader extracts the watermark information signal from the image data (optionally by first re-orienting the data based on the orientation parameters). FIG. 5 is a flow diagram illustrating a process of extracting a message from re-oriented image data 500. The reader scans the image samples (e.g., pixels or transform domain coefficients) of the re-oriented image (502), and uses known attributes of the watermark signal to estimate watermark signal values 504. Recall that in one example implementation described above, the embedder adjusted sample values (e.g., frequency coefficients, color values, etc.) up or down to embed a watermark information signal. The reader uses this attribute of the watermark information signal to estimate its value from the target image. Prior to making these estimates, the reader may filter the image to remove portions of the image signal that may interfere with the estimating process. For example, if the watermark signal is expected to reside in low or medium frequency bands, then high frequencies may be filtered out.

[0088] In addition, the reader may predict the value of the original un-watermarked image to enhance message recovery. One form of prediction uses temporal or spatial neighbors to estimate a sample value in the original image. In the frequency domain, frequency coefficients of the original signal can be predicted from neighboring frequency coefficients in the same frequency subband. In video applications for example, a frequency coefficient in a frame can be predicted from spatially neighboring coefficients within the same frame, or temporally neighboring coefficients in adjacent frames or fields. In the spatial domain, intensity values of a pixel can be estimated from intensity values of neighboring pixels. Having predicted the value of a signal in the original, un-watermarked image, the reader then estimates the watermark signal by calculating an inverse of the watermarking function used to combine the watermark signal with the original signal.

[0089] For such watermark signal estimates, the reader uses the assignment map to find the corresponding raw bit position and image sample in the carrier signal (506). The value of the raw bit is a function of the watermark signal estimate, and the carrier signal at the corresponding location in the carrier. To estimate the raw bit value, the reader solves for its value based on the carrier signal and the watermark signal estimate. As reflected generally in FIG. 5 (508), the result of this computation represents only one estimate to be analyzed along with other estimates impacting the value of the corresponding raw bit. Some estimates may indicate that the raw bit is likely to be a one, while others may indicate that it is a zero. After the reader completes its scan, it compiles the estimates for each bit position in the raw bit string, and makes a determination of the value of each bit at that position (510). Finally, it performs the inverse of the error correction coding scheme to construct the message (512). In some implementations, probabilistic models may be employed to determine the likelihood that a particular pattern of raw bits is just a random occurrence rather than a watermark.
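One simple way to compile the per-sample estimates into a decision for each raw bit is to accumulate signed votes, as in the sketch below. The summation rule, the function name and the tie-breaking choice are assumptions; the text leaves the method of determination open.

```python
def decide_raw_bits(estimates, num_bits):
    """Accumulate signed estimates per raw bit position and threshold at zero.

    estimates: iterable of (bit_index, value) pairs, where a positive value
               suggests a one and a negative value suggests a zero.
    A total of exactly zero is resolved to zero here (an arbitrary choice).
    """
    totals = [0.0] * num_bits
    for bit_index, value in estimates:
        totals[bit_index] += value
    return [1 if total > 0 else 0 for total in totals]


# Three estimates for raw bit 0 (two favor a one), two for raw bit 1 (both favor a zero).
estimates = [(0, 1), (0, -1), (0, 1), (1, -1), (1, -1)]
raw_bits = decide_raw_bits(estimates, num_bits=2)
```

The resulting raw bit string would then be passed to the inverse of the error correction coding scheme.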

[0090] 2.2.1 Example Illustrating Detector Process

[0091] FIG. 6 is a diagram depicting an example of a watermark detection process. The detector segments the target image into blocks (e.g., 600, 602) and then performs a 2-dimensional fast Fourier transform (2D FFT) on several blocks. This process yields 2D transforms of the magnitudes of the image contents of the blocks in the spatial frequency domain as depicted in the plot 604 shown in FIG. 6.

[0092] Next, the detector process performs a log polar remapping of each transformed block. The detector may add some of the blocks together to increase the watermark signal to noise ratio. The type of remapping in this implementation is referred to as a Fourier Mellin transform. The Fourier Mellin transform is a geometric transform that warps the image data from a frequency domain to a log polar coordinate system. As depicted in the plot 606 shown in FIG. 6, this transform sweeps through the transformed image data along a line at angle θ, mapping the data to a log polar coordinate system shown in the next plot 608. The log polar coordinate system has a rotation axis, representing the angle θ, and a scale axis. Inspecting the transformed data at this stage, one can see the orientation pattern of the watermark begin to be distinguishable from the noise component (i.e., the image signal).
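The value of the log polar coordinate system is that rotation and scaling of a point about the origin of the frequency plane become simple shifts along the rotation and scale axes. The sketch below checks this for a single point; the function names and numeric values are illustrative assumptions and are not taken from the specification.

```python
import math


def to_log_polar(u, v):
    """Map a frequency-plane point (u, v) to (log radius, angle in radians)."""
    return math.log(math.hypot(u, v)), math.atan2(v, u)


def rotate_and_scale(u, v, angle, scale):
    """Rotate a point about the origin by `angle` and scale its radius by `scale`."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return scale * (u * cos_a - v * sin_a), scale * (u * sin_a + v * cos_a)


log_r0, theta0 = to_log_polar(3.0, 4.0)
u1, v1 = rotate_and_scale(3.0, 4.0, angle=0.3, scale=2.0)
log_r1, theta1 = to_log_polar(u1, v1)

scale_shift = log_r1 - log_r0      # approximately log(2.0)
rotation_shift = theta1 - theta0   # approximately 0.3
```

Because every point of a rotated and scaled pattern is shifted by the same amounts, a correlation search over shifts in this coordinate system can recover rotation and scale together.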

[0093] Next, the detector performs a correlation 610 between the transformed image block and the transformed orientation pattern 612. At a high level, the correlation process slides the orientation pattern over the transformed image (in a selected transform domain, such as a spatial frequency domain) and measures the correlation at an array of discrete positions. Each such position has a corresponding scale and rotation parameter associated with it. Ideally, there is a position that clearly has the highest correlation relative to all of the others. In practice, there may be several candidates with a promising measure of correlation. As explained further below, these candidates may be subjected to one or more additional correlation stages to select the one that provides the best match.
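The sliding correlation can be sketched as a brute-force search over discrete shifts, assuming the image block and the orientation pattern have already been remapped to arrays of equal size. Wrap-around on both axes, the tiny 4 by 4 arrays and the function name are simplifying assumptions; a practical detector would likely use a faster correlation method.

```python
def best_shift(image, pattern):
    """Slide `pattern` over `image` with wrap-around and return the
    (score, row_shift, col_shift) with the highest multiply-accumulate score."""
    rows, cols = len(image), len(image[0])
    best = None
    for dr in range(rows):
        for dc in range(cols):
            score = sum(
                pattern[r][c] * image[(r + dr) % rows][(c + dc) % cols]
                for r in range(rows)
                for c in range(cols)
            )
            if best is None or score > best[0]:
                best = (score, dr, dc)
    return best


size = 4
pattern = [[0] * size for _ in range(size)]
pattern[0][0] = 1
pattern[1][2] = 1

# The image holds the same two impulses shifted by 2 rows and 1 column.
image = [[0] * size for _ in range(size)]
image[2][1] = 1
image[3][3] = 1

score, row_shift, col_shift = best_shift(image, pattern)
```

Each (row_shift, col_shift) position corresponds to one rotation and scale candidate; keeping the few highest scores instead of only the best would give the list of promising candidates mentioned above.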

[0094] There are a variety of ways to implement the correlation process.Any number of

pattern and performs a matching operation with the rotated and scaled pattern on the FFT of the target image. The matching operation multiplies the values of the transformed pattern with sample values at corresponding positions in the target image and accumulates the result to yield a measure of the correlation. The detector repeats this process for each of the candidates and picks the one with the highest measure of correlation. As shown in FIG. 6, the rotation and scale parameters (614) of the selected candidate are then used to find additional parameters that describe the orientation of the watermark in the target image.

[0095] The detector applies the scale and rotation to the target data block 616 and then performs another correlation process between the orientation pattern 618 and the scaled and rotated data block 616. The correlation process 620 is a generalized matching filter operation. It provides a measure of correlation for an array of positions that each has an associated translation parameter (e.g., an x, y position). Again, the detector may repeat the process of identifying promising candidates (i.e. those that reflect better correlation relative to others) and using those in an additional search for a parameter or set of orientation parameters that provide a better measure of correlation.

[0096] At this point, the detector has recovered the following orientation parameters: rotation, scale and translation. For many applications, these parameters may be sufficient to enable accurate reading of the watermark. In the read operation, the reader applies the orientation parameters to re-orient the target image and then proceeds to extract the watermark signal.

[0097] In some applications, the watermarked image may be stretched more in one spatial dimension than another. This type of distortion is sometimes referred to as differential scale or shear. Consider that the original image blocks are square. As a result of differential scale, each square may be warped into a parallelogram with unequal sides. Differential scale parameters define the nature and extent of this stretching.

[0098] There are several alternative ways to recover the differential scale parameters. One general class of techniques is to use the known parameters (e.g., the computed scale, rotation, and translation) as a starting point to find the differential scale parameters. Assuming the known parameters to be valid, this approach warps either the orientation pattern or the target image with selected amounts of differential scale and picks the differential scale parameters that yield the best correlation.

[0099] Another approach to determination of differential scale is set forth in application Ser. No. 09/452,022 (filed Nov. 30, 1999, and entitled Method and System for Determining Image Transformation, attorney docket 60057).

[0100] 2.2.2 Example Illustrating Reader Process

[0101] FIG. 7 is a diagram illustrating a re-oriented image 700 superimposed onto the original watermarked image 702. The difference in orientation and scale shows how the image was transformed and edited after the embedding process. The original watermarked image is sub-divided into tiles (e.g., pixel blocks 704, 706, etc.). When superimposed on the coordinate system of the original image 702 shown in FIG. 7, the target image blocks typically do not match the orientation of the original blocks.

[0102] The reader scans samples of the re-oriented image data, estimating the watermark information signal. It estimates the watermark information signal, in part, by predicting original sample values of the un-watermarked image. The reader then uses an inverted form of the watermarking function to estimate the watermark information signal from the watermarked signal and the predicted signal. This inverted watermarking function expresses the estimate of the watermark signal as a function of the predicted signal and the watermarked signal. Having an estimate of the watermark signal, it then uses the known relationship among the carrier signal, the watermark signal, and the raw bit to compute an estimate of the raw bit. Recall that samples in the watermark information signal are a function of the carrier signal and the raw bit value. Thus, the reader may invert this function to solve for an estimate of the raw bit value.

[0103] Recall that the embedder implementation discussed in connection with FIG. 2 redundantly encodes the watermark information signal in blocks of the input signal. Each raw bit may map to several samples within a block. In addition, the embedder repeats a mapping process for each of the blocks. As such, the reader generates several estimates of the raw bit value as it scans the watermarked image.

[0104] The information encoded in the raw bit string can be used to increase the accuracy of read operations. For instance, some of the raw bits act as signature bits that perform a validity checking function. Unlike unknown message bits, the reader knows the expected values of these signature bits. The reader can assess the validity of a read operation based on the extent to which the extracted signature bit values match the expected signature bit values. The estimates for a given raw bit value can then be given a higher weight when they are derived from a tile with a greater measure of validity.

[0105] 3.0 Embedder Implementation:

[0106] The following sections describe an implementation of the digital image watermark embedder depicted in FIG. 8. The embedder inserts two watermark components into the host image: a message component and a detection component (called the orientation pattern). The message component is defined in a spatial domain or other transform domain, while the orientation pattern is defined in a frequency domain. As explained later, the message component serves a dual function of conveying a message and helping to identify the watermark location in the image.

[0107] The embedder inserts the watermark message and orientation pattern in blocks of a selected color plane or planes (e.g., luminance or chrominance plane) of the host image. The message payload varies from one application to another, and can range from a single bit to the number of image samples in the domain in which it is embedded. The blocks may be blocks of samples in a spatial domain or some other transform domain.

[0108] 3.1 Encoding the Message

[0109] The embedder converts binary message bits into a series of binary raw bits that it hides in the host image. As part of this process, a message encoder 800 appends certain known bits to the message bits 802. It performs an error detection process (e.g., parity, Cyclic Redundancy Check (CRC), etc.) to generate error detection bits and adds the error detection bits to the message. An error correction coding operation then generates raw bits from the combined known and message bit string.
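The encoding pipeline described above can be sketched with deliberately simple stand-ins: a single even-parity bit takes the place of the error detection process, and a repetition code takes the place of the error correction code. Both stand-ins, the function name and the example bit values are assumptions chosen for brevity.

```python
def encode_message(message_bits, known_bits, repeat=3):
    """Append known bits, add an error detection bit, then error-correction encode.

    The parity bit and the repetition code are illustrative stand-ins for the
    error detection and error correction schemes mentioned in the text.
    """
    payload = list(message_bits) + list(known_bits)  # append the known bits
    parity = sum(payload) % 2                        # even-parity detection bit
    protected = payload + [parity]
    # Repetition code: each protected bit yields `repeat` raw bits.
    return [bit for bit in protected for _ in range(repeat)]


raw_bits = encode_message(message_bits=[1, 0, 1], known_bits=[1, 1])
```

Here five payload bits plus one parity bit yield eighteen raw bits, which the later stages spread and scatter over the block.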

[0110] For the error correction operation, the embedder may employ any of a variety of error correction codes such as Reed Solomon, BCH, convolution or turbo codes. The encoder may perform an M-ary modulation process on the message bits that maps groups of message bits to a message signal based on an M-ary symbol alphabet.

[0111] In one application of the embedder, the component of the message representing the known bits is encoded more redundantly than the other message bits. This is an example of a shorter message component having greater signal strength than a longer, weaker message component. The embedder gives priority to the known bits in this scheme because the reader uses them to verify that it has found the watermark in a potentially corrupted image, rather than a signal masquerading as the watermark.

[0112] 3.2 Spread Spectrum Modulation

[0113] The embedder uses spread spectrum modulation as part of the process of creating a watermark signal from the raw bits. A spread spectrum modulator 804 spreads each raw bit into a number of “chips.” The embedder generates a pseudo random number that acts as the carrier signal of the message. To spread each raw bit, the modulator performs an exclusive OR (XOR) operation between the raw bit and each bit of a pseudo random binary number of a pre-determined length. The length of the pseudo random number depends, in part, on the size of the message and the image. Preferably, the pseudo random number should contain roughly the same number of zeros and ones, so that the net effect of the raw bit on the host image block is zero. If a bit value in the pseudo random number is a one, the value of the raw bit is inverted. Conversely, if the bit value is a zero, then the value of the raw bit remains the same.
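The XOR spreading step, together with a matching despreading step, can be sketched as follows. The seeded generator, the exactly balanced carrier and the majority-vote despreader are assumptions made for illustration; the text only calls for a pseudo random binary number with roughly equal numbers of zeros and ones.

```python
import random


def make_carrier(length, seed):
    """Build a pseudo random binary carrier with a balanced count of zeros and ones."""
    bits = [0] * (length // 2) + [1] * (length - length // 2)
    random.Random(seed).shuffle(bits)  # the seed plays the role of a key (assumption)
    return bits


def spread(raw_bit, carrier):
    """XOR the raw bit with every carrier bit; a carrier one inverts the raw bit."""
    return [raw_bit ^ c for c in carrier]


def despread(chips, carrier):
    """Undo the XOR per chip and take a majority vote to recover the raw bit."""
    votes = sum(chip ^ c for chip, c in zip(chips, carrier))
    return 1 if 2 * votes > len(chips) else 0


carrier = make_carrier(16, seed=7)
chips_for_one = spread(1, carrier)
chips_for_zero = spread(0, carrier)
```

Because the carrier is balanced, half of the chips for any raw bit are ones and half are zeros, so the upward and downward adjustments cancel over the block.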

[0114] The length of the pseudorandom number may vary from one message bit or symbol to another. By varying the length of the number, some message bits can be spread more than others.

[0115] 3.3 Scattering the Watermark Message

[0116] The embedder scatters each of the chips corresponding to a raw bit throughout an image block. An assignment map 806 assigns locations in the block to the chips of each raw bit. Each raw bit is spread over several chips. As noted above, an image block may represent a block of transform domain coefficients or samples in a spatial domain. The assignment map may be used to encode some message bits or symbols (e.g., groups of bits) more redundantly than others by mapping selected bits to more locations in the host signal than other message bits. In addition, it may be used to map different messages, or different components of the same message, to different locations in the host signal.

[0117] FIG. 9 depicts an example of the assignment map. Each of the blocks in FIG. 9 corresponds to an image block and depicts a pattern of chips corresponding to a single raw bit. FIG. 9 depicts a total of 32 example blocks. The pattern within a block is represented as black dots on a white background. Each of the patterns is mutually exclusive such that each raw bit maps to a pattern of unique locations relative to the patterns of every other raw bit. Though not a requirement, the combined patterns, when overlapped, cover every location within the image block.
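An assignment map with mutually exclusive patterns that together cover the whole block can be sketched by shuffling all block locations and dealing them out to the raw bits in turn. The shuffle-and-deal construction, the seed, the function name and the 16 by 16 block size are assumptions; the specification does not state how the patterns of FIG. 9 were generated.

```python
import random


def build_assignment_map(num_raw_bits, block_size, seed):
    """Return one list of (row, col) chip locations per raw bit.

    Locations are shuffled, then dealt round-robin, so the patterns are
    mutually exclusive and jointly cover every location in the block.
    """
    locations = [(r, c) for r in range(block_size) for c in range(block_size)]
    random.Random(seed).shuffle(locations)
    return [locations[i::num_raw_bits] for i in range(num_raw_bits)]


assignment_map = build_assignment_map(num_raw_bits=32, block_size=16, seed=1)
chips_per_bit = [len(pattern) for pattern in assignment_map]
all_locations = [loc for pattern in assignment_map for loc in pattern]
```

With 32 raw bits and 256 locations, each raw bit receives 8 chip locations. Dealing unequal shares instead would encode some bits more redundantly than others.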

[0118] 3.4 Gain Control and Perceptual Analysis

[0119] To insert the information carried in a chip to the host image, the embedder alters the corresponding sample value in the host image. In particular, for a chip having a value of one, it adds to the corresponding sample value, and for a chip having a value of zero, it subtracts from the corresponding sample value. A gain controller in the embedder adjusts the extent to which each chip adds or subtracts from the corresponding sample value.

[0120] The gain controller takes into account the orientation pattern when determining the gain. It applies a different gain to the orientation pattern than to the message component of the watermark. After applying the gain, the embedder combines the orientation pattern and message components together to form the composite watermark signal, and combines the composite watermark with the image block. One way to combine these signal components is to add them, but other linear or non-linear functions may be used as well.

[0121] The orientation pattern is comprised of a pattern of quad symmetric impulse functions in the spatial frequency domain. In the spatial domain, these impulse functions look like cosine waves. An example of the orientation pattern is depicted in FIGS. 10 and 11. FIG. 10 shows the impulse functions as points in the spatial frequency domain, while FIG. 11 shows the orientation pattern in the spatial domain. Before adding the orientation pattern component to the message component, the embedder may transform the watermark components to a common domain. For example, if the message component is in a spatial domain and the orientation component is in a frequency domain, the embedder transforms the orientation component to a common spatial domain before combining them together.

[0122] FIG. 8 depicts the gain controller used in the embedder. Note that the gain controller operates on the blocks of image samples 808, the message watermark signal, and a global gain input 810, which may be specified by the user. A perceptual analyzer component 812 of the gain controller performs a perceptual analysis on the block to identify samples that can tolerate a stronger watermark signal without substantially impacting visibility. In places where the naked eye is less likely to notice the watermark, the perceptual analyzer increases the strength of the watermark. Conversely, it decreases the watermark strength where the eye is more likely to notice the watermark.

[0123] The perceptual analyzer shown in FIG. 8 performs a series of filtering operations on the image block to compute an array of gain values. There are a variety of filters suitable for this task. These filters include an edge detector filter that identifies edges of objects in the image, a non-linear filter to map gain values into a desired range, and averaging or median filters to smooth the gain values. Each of these filters may be implemented as a series of one-dimensional filters (one operating on rows and the other on columns) or two-dimensional filters. The size of the filters (i.e. the number of samples processed to compute a value for a given location) may vary (e.g., 3 by 3, 5 by 5, etc.). The shape of the filters may vary as well (e.g., square, cross-shaped, etc.). The perceptual analyzer process produces a detailed gain multiplier. The multiplier is a vector with elements corresponding to samples in a block.

[0124] Another component 818 of the gain controller computes an asymmetric gain based on the output of the image sample values and message watermark signal. This component analyzes the samples of the block to determine whether they are consistent with the message signal. The embedder reduces the gain for samples whose values relative to neighboring values are consistent with the message signal.

[0125] The embedder applies the asymmetric gain to increase the chances of an accurate read in the watermark reader. To understand the effect of the asymmetric gain, it is helpful to explain the operation of the reader. The reader extracts the watermark message signal from the watermarked signal using a predicted version of the original signal. It estimates the watermark message signal value based on values of the predicted signal and the watermarked signal at locations of the watermarked signal suspected of containing a watermark signal. There are several ways to predict the original signal. One way is to compute a local average of samples around the sample of interest. The average may be computed by taking the average of vertically adjacent samples, horizontally adjacent samples, an average of samples in a cross-shaped filter (both vertical and horizontal neighbors), an average of samples in a square-shaped filter, etc. The estimate may be computed one time based on a single predicted value from one of these averaging computations. Alternatively, several estimates may be computed based on two or more of these averaging computations (e.g., one estimate for vertically adjacent samples and another for horizontally adjacent samples). In the latter case, the reader may keep estimates if they satisfy a similarity metric. In other words, the estimates are deemed valid if they are within a predetermined value of each other or have the same polarity.
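The two-estimate variant of this prediction can be sketched as follows, using the same-polarity test as the similarity metric. The function name, the restriction to interior samples and the return convention (+1, -1 or None) are assumptions made for illustration.

```python
def estimate_chip(block, r, c):
    """Estimate the chip polarity at interior sample (r, c).

    Predicts the original sample twice, from the vertical and from the
    horizontal neighbors, and keeps the estimate only when both differences
    have the same polarity. Returns +1, -1, or None when they disagree.
    """
    vertical_mean = (block[r - 1][c] + block[r + 1][c]) / 2
    horizontal_mean = (block[r][c - 1] + block[r][c + 1]) / 2
    diff_vertical = block[r][c] - vertical_mean
    diff_horizontal = block[r][c] - horizontal_mean
    if diff_vertical > 0 and diff_horizontal > 0:
        return 1
    if diff_vertical < 0 and diff_horizontal < 0:
        return -1
    return None  # estimates fail the similarity test and are discarded


raised = [[10, 10, 10], [10, 13, 10], [10, 10, 10]]
ambiguous = [[10, 20, 10], [5, 12, 5], [10, 20, 10]]

raised_estimate = estimate_chip(raised, 1, 1)
ambiguous_estimate = estimate_chip(ambiguous, 1, 1)
```

In the second block the center sample lies above its horizontal neighbors but below its vertical neighbors, so the reader would discard the estimate.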

[0126] Knowing this behavior of the reader, the embedder computes the asymmetric gain as follows. For samples that have values relative to their neighbors that are consistent with the watermark signal, the embedder reduces the asymmetric gain. Conversely, for samples that are inconsistent with the watermark signal, the embedder increases the asymmetric gain. For example, if the chip value is a one, then the sample is consistent with the watermark signal if its value is greater than its neighbors. Alternatively, if the chip value is a zero, then the sample is consistent with the watermark signal if its value is less than its neighbors.
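The consistency rule above can be sketched as a two-level gain. The specific gain values (0.5 and 1.5), the use of a neighbor mean as the reference and the function name are hypothetical choices; the text states only that the gain is reduced for consistent samples and increased for inconsistent ones.

```python
def asymmetric_gain(sample, neighbor_mean, chip, reduced=0.5, increased=1.5):
    """Return a reduced gain when the host sample already agrees with the chip.

    chip == 1 is consistent when the sample exceeds its neighbors;
    chip == 0 is consistent when the sample is below its neighbors.
    The `reduced` and `increased` values are illustrative assumptions.
    """
    if chip == 1:
        consistent = sample > neighbor_mean
    else:
        consistent = sample < neighbor_mean
    return reduced if consistent else increased


gain_consistent_one = asymmetric_gain(sample=12, neighbor_mean=10, chip=1)
gain_inconsistent_zero = asymmetric_gain(sample=12, neighbor_mean=10, chip=0)
gain_consistent_zero = asymmetric_gain(sample=8, neighbor_mean=10, chip=0)
```

A sample equal to its neighbor mean is treated as inconsistent in this sketch, so it receives the increased gain.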

[0127] Another component 820 of the gain controller computes a differential gain, which represents an adjustment in the message vs. orientation pattern gains. As the global gain increases, the embedder emphasizes the message gain over the orientation pattern gain by adjusting the global gain by an adjustment factor. The inputs to this process 820 include the global gain 810 and a message differential gain 822. When the global gain is below a lower threshold, the adjustment factor is one. When the global gain is above an upper threshold, the adjustment factor is set to an upper limit greater than one. For global gains falling within the two thresholds, the adjustment factor increases linearly between one and the upper limit. The message differential gain is the product of the adjustment factor and the global gain.
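The piecewise-linear adjustment factor can be written out directly. The threshold and limit values in the example are hypothetical; the text does not give numeric values.

```python
def adjustment_factor(global_gain, lower, upper, limit):
    """One below `lower`, `limit` above `upper`, linear in between."""
    if global_gain <= lower:
        return 1.0
    if global_gain >= upper:
        return limit
    fraction = (global_gain - lower) / (upper - lower)
    return 1.0 + fraction * (limit - 1.0)


def message_differential_gain(global_gain, lower, upper, limit):
    """Product of the adjustment factor and the global gain."""
    return adjustment_factor(global_gain, lower, upper, limit) * global_gain


# Hypothetical thresholds: lower = 2, upper = 6, upper limit = 2.
factor_low = adjustment_factor(1.0, 2.0, 6.0, 2.0)
factor_mid = adjustment_factor(4.0, 2.0, 6.0, 2.0)
factor_high = adjustment_factor(8.0, 2.0, 6.0, 2.0)
mid_gain = message_differential_gain(4.0, 2.0, 6.0, 2.0)
```

With these values a global gain of 4 sits halfway between the thresholds, so the factor is halfway between one and the upper limit.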

[0128] At this point, there are four sources of gain: the detailed gain, the global gain, the asymmetric gain, and the message dependent gain. The embedder applies the first two gain quantities to both the message and orientation watermark signals. It only applies the latter two to the message watermark signal. FIG. 8 depicts how the embedder applies the gain to the two watermark components. First, it multiplies the detailed gain with the global gain to compute the orientation pattern gain. It then multiplies the orientation pattern gain with the adjusted message differential gain and asymmetric gain to form the composite message gain.
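The two multiplications described above can be sketched per sample as follows. Treating the detailed and asymmetric gains as per-sample lists and the other two gains as scalars is an assumption, as are the function name and the numeric values.

```python
def composite_gains(detailed, global_gain, message_diff_gain, asymmetric):
    """Compute per-sample orientation pattern gain and composite message gain.

    detailed, asymmetric: per-sample gain lists of equal length
    global_gain, message_diff_gain: scalars (assumption)
    """
    orientation_gain = [d * global_gain for d in detailed]
    message_gain = [
        o * message_diff_gain * a for o, a in zip(orientation_gain, asymmetric)
    ]
    return orientation_gain, message_gain


orientation_gain, message_gain = composite_gains(
    detailed=[1.0, 2.0],
    global_gain=2.0,
    message_diff_gain=3.0,
    asymmetric=[0.5, 1.5],
)
```

The orientation pattern gain depends only on the detailed and global gains, while the message gain additionally reflects the differential and asymmetric gains.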

[0129] Finally, the embedder forms the composite watermark signal. It multiplies the composite message gain with the message signal, and multiplies the orientation pattern gain with the orientation pattern signal. It then combines the result in a common transform domain to get the composite watermark. The embedder applies a watermarking function to combine the composite watermark to the block to create a watermarked image block. The message and orientation components of the watermark may be combined by mapping the message bits to samples of the orientation signal, and modulating the samples of the orientation signal to encode the message.

[0130] The embedder computes the watermark message signal by converting the output of the assignment map 806 to delta values, indicating the extent to which the watermark signal changes the host signal. As noted above, a chip value of one corresponds to an upward adjustment of the corresponding sample, while a chip value of zero corresponds to a downward adjustment. The embedder specifies the specific amount of adjustment by assigning a delta value to each of the watermark message samples (830).

[0131] 4.0 Detector Implementation

[0132] FIG. 12 illustrates an overview of a watermark detector that detects the presence of a detection watermark in a host image and its orientation. Using the orientation pattern and the known bits inserted in the watermark message, the detector determines whether a potentially corrupted image contains a watermark, and if so, its orientation in the image.

[0133] Recall that the composite watermark is replicated in blocks of the original image. After an embedder places the watermark in the original digital image, the watermarked image is likely to undergo several transformations, either from routine processing or from intentional tampering. Some of these transformations include: compression, decompression, color space conversion, digital to analog conversion, printing, scanning, analog to digital conversion, scaling, rotation, inversion, flipping, differential scale, and lens distortion. In addition to these transformations, various noise sources can corrupt the watermark signal, such as fixed pattern noise, thermal noise, etc.

[0134] When building a detector implementation for a particular application, the developer may implement counter-measures to mitigate the impact of the types of transformations, distortions and noise expected for that application. Some applications may require more counter-measures than others. The detector described below is designed to recover a watermark from a watermarked image after the image has been printed, and scanned. The following sections describe the counter-measures to mitigate the impact of various forms of corruption. The developer can select from among these counter-measures when implementing a detector for a particular application.

[0135] For some applications, the detector will operate in a system that provides multiple image frames of a watermarked object. One typical example of such a system is a computer equipped with a digital camera. In such a configuration, the digital camera can capture a temporal sequence of images as the user or some device presents the watermarked image to the camera.

[0136] As shown in FIG. 12, the principal components of the detector are: 1) pre-processor 900; 2) rotation and scale estimator 902; 3) orientation parameter refiner 904; 4) translation estimator 906; 5) translation refiner 908; and 6) reader 910.

[0137] The preprocessor 900 takes one or more frames of image data 912 and produces a set of image blocks 914 prepared for further analysis. The rotation-scale estimator 902 computes rotation-scale vectors 916 that estimate the orientation of the orientation signal in the image blocks. The parameter refiner 904 collects additional evidence of the orientation signal and further refines the rotation scale vector candidates by estimating differential scale parameters. The result of this refining stage is a set of 4D vector candidates 918 (rotation, scale, and two differential scale parameters). The translation estimator 906 uses the 4D vector candidates to re-orient image blocks with promising evidence of the orientation signal. It then finds estimates of translation parameters 920. The translation refiner 908 invokes the reader 910 to assess the merits of an orientation vector. When invoked by the detector, the reader uses the orientation vector to approximate the original orientation of the host image and then extracts values for the known bits in the watermark message. The detector uses this information to assess the merits of and refine orientation vector candidates.

[0138] By comparing the extracted values of the known bits with the expected values, the reader provides a figure of merit for an orientation vector candidate. The translation refiner then picks a 6D vector, including rotation, scale, differential scale and translation, that appears likely to produce a valid read of the watermark message 922. The following sections describe implementations of these components in more detail.

[0139] 4.1 Detector Pre-processing

[0140] FIG. 13 is a flow diagram illustrating preprocessing operations in the detector shown in FIG. 12. The detector performs a series of pre-processing operations on the native image 930 to prepare the image data for further analysis. It begins by filling memory with one or more frames of native image data (932), and selecting sets of pixel blocks 934 from the native image data for further analysis (936). While the detector can detect a watermark using a single image frame, it also has support for detecting the watermark using additional image frames. As explained below, the use of multiple frames has the potential for increasing the chances of an accurate detection and read.

[0141] In applications where a camera captures an input image of a watermarked object, the detector may be optimized to address problems resulting from movement of the object. Typical PC cameras, for example, are capable of capturing images at a rate of at least 10 frames a second. A frustrated user might move the object in an attempt to improve detection. Rather than improving the chances of detection, the movement of the object changes the orientation of the watermark from one frame to the next, potentially making the watermark more difficult to detect. One way to address this problem is to buffer one or more frames, and then screen the frame or frames to determine if they are likely to contain a valid watermark signal. If such screening indicates that a frame is not likely to contain a valid signal, the detector can discard it and proceed to the next frame in the buffer, or buffer a new frame. Another enhancement is to isolate portions of a frame that are most likely to have a valid watermark signal, and then perform more detailed detection of the isolated portions.

[0142] After loading the image into the memory, the detector selects image blocks 934 for further analysis. It is not necessary to load or examine each block in a frame because it is possible to extract the watermark using only a portion of an image. The detector looks at only a subset of the samples in an image, and preferably analyzes samples that are more likely to have a recoverable watermark signal.

[0143] The detector identifies portions of the image that are likely to have the highest watermark signal to noise ratio. It then attempts to detect the watermark signal in the identified portions. In the context of watermark detection, the host image is considered to be a source of noise along with conventional noise sources. While it is typically not practical to compute the signal to noise ratio, the detector can evaluate attributes of the signal that are likely to evince a promising watermark signal to noise ratio. These properties include the signal activity (as measured by sample variance, for example), and a measure of the edges (abrupt changes in image sample values) in an image block. Preferably, the signal activity of a candidate block should fall within an acceptable range, and the block should not have a high concentration of strong edges. One way to quantify the edges in the block is to use an edge detection filter (e.g., a Laplacian, Sobel, etc.).

[0144] In one implementation, the detector divides the input image into blocks, and analyzes each block based on pre-determined metrics. It then ranks the blocks according to these metrics. The detector then operates on the blocks in the order of the ranking. The metrics include sample variance in a candidate block and a measure of the edges in the block. The detector combines these metrics for each candidate block to compute a rank representing the probability that it contains a recoverable watermark signal.
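The block ranking described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the variance and edge metrics are combined, so the scoring rule, the variance range (`var_lo`, `var_hi`), the `edge_weight` parameter and all function names are hypothetical. The edge measure used here is a mean absolute 4-neighbor Laplacian response, one of the edge filters the text mentions.

```python
from statistics import pvariance


def edge_measure(block):
    """Mean absolute 4-neighbor Laplacian response over interior samples."""
    rows, cols = len(block), len(block[0])
    total, count = 0.0, 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            lap = (block[r - 1][c] + block[r + 1][c]
                   + block[r][c - 1] + block[r][c + 1]
                   - 4 * block[r][c])
            total += abs(lap)
            count += 1
    return total / count if count else 0.0


def block_score(block, var_lo=25.0, var_hi=2500.0, edge_weight=1.0):
    """Hypothetical score: zero when the sample variance is outside the
    acceptable range, otherwise reduced as the edge content grows."""
    variance = pvariance([v for row in block for v in row])
    if not (var_lo <= variance <= var_hi):
        return 0.0
    return 1.0 / (1.0 + edge_weight * edge_measure(block))


def rank_blocks(blocks):
    """Return block indices ordered from most to least promising."""
    return sorted(range(len(blocks)),
                  key=lambda i: block_score(blocks[i]),
                  reverse=True)


flat = [[128] * 4 for _ in range(4)]                 # no signal activity
gradient = [[100, 110, 120, 130] for _ in range(4)]  # moderate activity, no edges
edge = [[100, 100, 140, 140] for _ in range(4)]      # contains a strong edge
order = rank_blocks([flat, gradient, edge])
```

With these inputs the smooth gradient block ranks first, the block with the strong edge second, and the flat block last.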

[0145] In another implementation, the detector selects a pattern of blocks and evaluates each one to try to make the most accurate read from the available data. In either implementation, the block pattern and size may vary. This particular implementation selects a pattern of overlapping blocks (e.g., a row of horizontally aligned, overlapping blocks). One optimization of this approach is to adaptively select a block pattern that increases the signal to noise ratio of the watermark signal. While shown as one of the initial operations in the preparation, the selection of blocks can be postponed until later in the pre-processing stage.

[0146] Next, the detector performs a color space conversion on native image data to compute an array of image samples in a selected color space for each block (936). In the following description, the color space is luminance, but the watermark may be encoded in one or more different color spaces. The objective is to get a block of image samples with the lowest noise practical for the application. While the implementation currently performs a row by row conversion of the native image data into 8 bit integer luminance values, it may be preferable to convert to floating-point values for some applications. One optimization is to select a luminance converter that is adapted for the sensor used to capture the digital input image. For example, one might experimentally derive the lowest noise luminance conversion for commercially available sensors, e.g., CCD cameras or scanners, CMOS cameras, etc. Then, the detector could be programmed to select either a default luminance converter, or one tuned to a specific type of sensor.

[0147] At one or more stages of the detector, it may be useful to perform operations to mitigate the impact of noise and distortion. In the pre-processing phase, for example, it may be useful to evaluate fixed pattern noise and mitigate its effect (938). The detector may look for fixed pattern noise in the native input data or the luminance data, and then mitigate it.

[0148] One way to mitigate certain types of noise is to combine data from different blocks in the same frame, or corresponding blocks in different frames 940. This process helps augment the watermark signal present in the blocks, while reducing the noise common to the blocks. For example, merely adding blocks together may mitigate the effects of common noise.

[0149] In addition to common noise, other forms of noise may appear in each of the blocks such as noise introduced in the printing or scanning processes. Depending on the nature of the application, it may be advantageous to perform common noise recognition and removal at this stage 942. The developer may select a filter or series of filters to target certain types of noise that appear during experimentation with images. Certain types of median filters may be effective in mitigating the impact of spectral peaks (e.g., speckles) introduced in printing or scanning operations.

[0150] In addition to introducing noise, the printing and image capture processes may transform the color or orientation of the original, watermarked image. As described above, the embedder typically operates on a digital image in a particular color space and at a desired resolution. The watermark embedders normally operate on digital images represented in an RGB or CMYK color space at a desired resolution (e.g., 100 dpi or 300 dpi, the resolution at which the image is printed). The images are then printed on paper with a screen printing process that uses the CMYK subtractive color space at a line per inch (LPI) value ranging from 65 to 200. 133 lines/in is typical for quality magazines and 73 lines/in is typical for newspapers. In order to produce a quality image and avoid pixelization, the rule of thumb is to use digital images with a resolution that is at least twice the press resolution. This is due to the half tone printing for color production. Also, different presses use screens with different patterns and line orientations and have different precision for color registration.

[0151] One way to counteract the transforms introduced through the printing process is to develop a model that characterizes these transforms and optimize watermark embedding and detecting based on this characterization. Such a model may be developed by passing watermarked and unwatermarked images through the printing process and observing the changes that occur to these images. The resulting model characterizes the changes introduced due to the printing process. The model may represent a transfer function that approximates the transforms due to the printing process. The detector then implements a pre-processing stage that reverses or at least mitigates the effect of the printing process on watermarked images. The detector may implement a pre-processing stage that performs the inverse of the transfer function for the printing process.

[0152] A related challenge is the variety in paper attributes used in different printing processes. Papers of various qualities, thicknesses and stiffnesses absorb ink in various ways. Some papers absorb ink evenly, while others absorb ink at rates that vary with the changes in the paper's texture and thickness. These variations may degrade the embedded watermark signal when a digitally watermarked image is printed. The watermark process can counteract these effects by classifying and characterizing paper so that the embedder and reader can compensate for this printing-related degradation.

[0153] Variations in image capture processes also pose a challenge. In some applications, it is necessary to address problems introduced due to interlaced image data. Some video cameras produce interlaced fields representing the odd or even scan lines of a frame. Problems arise when the interlaced image data consists of fields from two consecutive frames. To construct an entire frame, the preprocessor may combine the fields from consecutive frames while dealing with the distortion due to motion that occurs from one frame to the next. For example, it may be necessary to shift one field before interleaving it with another field to counteract inter-frame motion. A de-blurring function may be used to mitigate the blurring effect due to the motion between frames.

[0154] Another problem associated with cameras in some applications is blurring due to the lack of focus. The preprocessor can mitigate this effect by estimating parameters of a blurring function and applying a de-blurring function to the input image.

[0155] Yet another problem associated with cameras is that they tend to have color sensors that utilize different color pattern implementations. As such, a sensor may produce colors slightly different than those represented in the object being captured. Most CCD and CMOS cameras use an array of sensors to produce colored images. The sensors in the array are arranged in clusters of sensors sensitive to the three primary colors, red, green, and blue, according to a specific pattern. Sensors designated for a particular color are dyed with that color to increase their sensitivity to the designated color. Many camera manufacturers use a Bayer color pattern GR/BG. While this pattern produces good image quality, it causes color mis-registration that degrades the watermark signal. Moreover, the color space converter, which maps the signal from the sensors to another color space such as YUV or RGB, may vary from one manufacturer to another. One way to counteract the mis-registration of the camera's color pattern is to account for the distortion due to the pattern in a color transformation process, implemented either within the camera itself, or as a pre-processing function in the detector.

[0156] Another challenge in counteracting the effects of the image capture process is dealing with the different types of distortion introduced from various image capture devices. For example, cameras have different sensitivities to light. In addition, their lenses have different spherical distortion and noise characteristics. Some scanners have poor color reproduction or introduce distortion in the image aspect ratio. Some scanners introduce aliasing and employ interpolation to increase resolution. The detector can counteract these effects in the pre-processor by using an appropriate inverse transfer function. An off-line process first characterizes the distortion of several different image capture devices (e.g., by passing test images through the scanner and deriving a transfer function modeling the scanner distortion). Some detectors may be equipped with a library of such inverse transfer functions from which they select one that corresponds to the particular image capture device.

[0157] Yet another challenge in applications where the image is printed on paper and later scanned is that the paper deteriorates over time and degrades the watermark. Also, varying lighting conditions make the watermark difficult to detect. Thus, the watermark may be selected so as to be more impervious to expected deterioration, and recoverable over a wider range of lighting conditions.

[0158] At the close of the pre-processing stage, the detector has selected a set of blocks for further processing. It then proceeds to gather evidence of the orientation signal in these blocks, and estimate the orientation parameters of promising orientation signal candidates. Since the image may have suffered various forms of corruption, the detector may identify several parts of the image that appear to have attributes similar to the orientation signal. As such, the detector may have to resolve potentially conflicting and ambiguous evidence of the orientation signal. To address this challenge, the detector estimates orientation parameters, and then refines these estimates to extract the orientation parameters that are more likely to evince a valid signal than other parameter candidates.

[0159] 4.2 Estimating Initial Orientation Parameters

[0160] FIG. 14 is a flow diagram illustrating a process for estimating rotation-scale vectors. The detector loops over each image block (950), calculating rotation-scale vectors with the best detection values in each block. First, the detector filters the block in a manner that tends to amplify the orientation signal while suppressing noise, including noise from the host image itself (952). Implemented as a multi-axis Laplacian filter, the filter highlights edges (e.g., high frequency components of the image) and then suppresses them. The term "multi-axis" means that the filter includes a series of stages that each operate on a particular axis. First, the filter operates on the rows of luminance samples, then operates on the columns, and adds the results. The filter may be applied along other axes as well. Each pass of the filter produces values at discrete levels. The final result is an array of samples, each having one of five values: {−2, −1, 0, 1, 2}.
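A minimal sketch of a two-axis filter with this five-level output follows. The text does not give the kernel or the quantization rule, so this sketch assumes each pass takes the sign of a one-dimensional Laplacian response (kernel −1, 2, −1) and that block borders wrap around; the function names are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def axis_pass(block, dr, dc):
    """One filter stage: sign of the 1D Laplacian response along the
    axis given by the step (dr, dc). Borders wrap around (an assumption)."""
    rows, cols = len(block), len(block[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            before = block[(r - dr) % rows][(c - dc) % cols]
            after = block[(r + dr) % rows][(c + dc) % cols]
            out[r][c] = sign(2 * block[r][c] - before - after)
    return out


def multi_axis_filter(block):
    """Filter along rows, then along columns, and add the two results.
    Each pass yields values in {-1, 0, 1}, so the sum lies in {-2, ..., 2}."""
    along_rows = axis_pass(block, 0, 1)
    along_cols = axis_pass(block, 1, 0)
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(along_rows, along_cols)]


block = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
filtered = multi_axis_filter(block)
```

For the isolated peak in `block`, the center sample maps to 2 and its four direct neighbors map to −1, while the corners stay at 0.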

[0161] Next, the detector performs a windowing operation on the block data to prepare it for an FFT transform (954). This windowing operation provides signal continuity at the block edges. The detector then performs an FFT (956) on the block, and retains only the magnitude component (958).

[0162] In an alternative implementation, the detector may use the phase signal produced by the FFT to estimate the translation parameter of the orientation signal. For example, the detector could use the rotation and scale parameters extracted in the process described below, and then compute the phase that provided the highest measure of correlation with the orientation signal using the phase component of the FFT process.

[0163] After computing the FFT, the detector applies a Fourier magnitude filter (960) on the magnitude components. The filter in the implementation slides over each sample in the Fourier magnitude array and filters the sample's eight neighbors in a square neighborhood centered at the sample. The filter boosts values representing a sharp peak with a rapid fall-off, and suppresses the fall-off portion. It also performs a threshold operation to clip peaks to an upper threshold.

[0164] Next, the detector performs a log-polar re-sample (962) of the filtered Fourier magnitude array to produce a log-polar array 964. This type of operation is sometimes referred to as a Fourier-Mellin transform. The detector, or some off-line pre-processor, performs a similar operation on the orientation signal to map it to the log-polar coordinate system. Using matching filters, the detector implementation searches for an orientation signal in a specified window of the log-polar coordinate system. For example, consider that the log-polar coordinate system is a two dimensional space with the scale being the vertical axis and the angle being the horizontal axis. The window ranges from 0 to 90 degrees on the horizontal axis and from approximately 50 to 2400 dpi on the vertical axis. Note that the orientation pattern should be selected so that routine scaling does not push the orientation pattern out of this window. The orientation pattern can be designed to mitigate this problem, as noted above, and as explained in co-pending patent application No. 60/136,572, filed May 28, 1999, by Ammon Gustafson, entitled Watermarking System With Improved Technique for Detecting Scaling and Rotation.
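The value of the log-polar re-sample is that rotation and scaling of the image become shifts along the two axes of the log-polar array. The sketch below illustrates this with nearest-neighbor sampling of a small magnitude array whose zero-frequency sample is at the center. The sampling grid, the 180 degree angular span, the `r_min` parameter and the function name are assumptions made for illustration; the implementation described above uses its own window and resolution.

```python
import math


def log_polar_resample(mag, n_scale, n_angle, r_min=1.0, angle_span_deg=180.0):
    """Nearest-neighbor log-polar re-sample of a square magnitude array.

    Rows of the result index log-radius (scale) and columns index angle.
    Requires n_scale >= 2.
    """
    size = len(mag)
    center = size // 2
    r_max = size / 2 - 1
    out = [[0.0] * n_angle for _ in range(n_scale)]
    for i in range(n_scale):
        # Radii are spaced geometrically between r_min and r_max.
        radius = r_min * (r_max / r_min) ** (i / (n_scale - 1))
        for j in range(n_angle):
            theta = math.radians(angle_span_deg * j / n_angle)
            y = int(round(center + radius * math.sin(theta)))
            x = int(round(center + radius * math.cos(theta)))
            out[i][j] = mag[y][x]
    return out


def impulse(size, y, x):
    """Square array of zeros with a single unit sample at (y, x)."""
    arr = [[0.0] * size for _ in range(size)]
    arr[y][x] = 1.0
    return arr


base = log_polar_resample(impulse(17, 8, 12), n_scale=8, n_angle=4)
rotated = log_polar_resample(impulse(17, 11, 11), n_scale=8, n_angle=4)
scaled = log_polar_resample(impulse(17, 8, 14), n_scale=8, n_angle=4)
```

Here the impulse in `base` lands in row 5, column 0. Rotating it by 45 degrees moves it one column to the right, and moving it farther from the center moves it down one row.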

[0165] The detector proceeds to correlate the orientation and the target signal in the log-polar coordinate system. As shown in FIG. 14, the detector uses a generalized matched filter GMF (966). The GMF performs an FFT on the orientation and target signal, multiplies the resulting Fourier domain entities, and performs an inverse FFT. This process yields a rectangular array of values in log-polar coordinates, each representing a measure of correlation and having a corresponding rotation angle and scale vector. As an optimization, the detector may also perform the same correlation operations for distorted versions (968, 970, 972) of the orientation signal to see if any of the distorted orientation patterns results in a higher measure of correlation. For example, the detector may repeat the correlation operation with some pre-determined amount of horizontal and vertical differential distortion (970, 972). The result of this correlation process is an array of correlation values 974 specifying the amount of correlation that each corresponding rotation-scale vector provides.
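The transform, multiply, inverse-transform sequence can be sketched as a plain circular cross-correlation. To stay self-contained, the sketch uses a naive discrete Fourier transform in place of an FFT, which gives the same result on small arrays but is far slower. It also assumes the product uses the complex conjugate of the reference transform, as is usual for correlation, and it omits any additional weighting a generalized matched filter may apply. All names are hypothetical.

```python
import cmath


def dft(seq, inverse=False):
    """Naive 1D discrete Fourier transform (stand-in for an FFT)."""
    n = len(seq)
    s = 1 if inverse else -1
    out = [sum(seq[k] * cmath.exp(s * 2j * cmath.pi * u * k / n)
               for k in range(n))
           for u in range(n)]
    return [v / n for v in out] if inverse else out


def dft2(arr, inverse=False):
    """2D transform: transform the rows, then the columns."""
    rows = [dft(list(r), inverse) for r in arr]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def matched_filter(target, reference):
    """Circular cross-correlation computed in the Fourier domain."""
    t_f = dft2(target)
    r_f = dft2(reference)
    product = [[t * r.conjugate() for t, r in zip(t_row, r_row)]
               for t_row, r_row in zip(t_f, r_f)]
    return [[v.real for v in row] for row in dft2(product, inverse=True)]


def peak_location(corr):
    """Row and column of the largest correlation value."""
    cells = ((r, c) for r in range(len(corr)) for c in range(len(corr[0])))
    return max(cells, key=lambda rc: corr[rc[0]][rc[1]])


# Reference pattern of four impulses, and a copy shifted by (1, 2).
reference = [[0.0] * 8 for _ in range(8)]
for r, c in [(0, 0), (1, 3), (4, 2), (6, 5)]:
    reference[r][c] = 1.0
target = [[reference[(r - 1) % 8][(c - 2) % 8] for c in range(8)]
          for r in range(8)]
corr = matched_filter(target, reference)
shift = peak_location(corr)
```

The peak of `corr` sits at the shift between the two arrays. In the log-polar domain, such a shift corresponds to a rotation angle and scale.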

[0166] The detector processes this array to find the top M peaks and their location in the log-polar space 976. To extract the location more accurately, the detector uses interpolation to provide the inter-sample location of each of the top peaks 978. The interpolator computes the 2D median of the samples around a peak and provides the location of the peak center to an accuracy of 0.1 sample.

[0167] The detector proceeds to rank the top rotation-scale vectors based on yet another correlation process 980. In particular, the detector performs a correlation between a Fourier magnitude representation for each rotation-scale vector candidate and a Fourier magnitude specification of the orientation signal 982. Each Fourier magnitude representation is scaled and rotated by an amount reflected by the corresponding rotation-scale vector. This correlation operation sums a point-wise multiplication of the orientation pattern impulse functions in the frequency domain with the Fourier magnitude values of the image at corresponding frequencies to compute a measure of correlation for each peak 984. The detector then sorts correlation values for the peaks (986).

[0168] Finally, the detector computes a detection value for each peak (988). It computes the detection value by quantizing the correlation values. Specifically, it computes a ratio of the peak's correlation value and the correlation value of the next largest peak. Alternatively, the detector may compute the ratio of the peak's correlation value and a sum or average of the correlation values of the next n highest peaks, where n is some predetermined number. Then, the detector maps this ratio to a detection value based on a statistical analysis of unmarked images.

[0169] The statistical analysis plots a distribution of peak ratio values found in unmarked images. The ratio values are mapped to a detection value based on the probability that the value came from an unmarked image. For example, 90% of the ratio values in unmarked images fall below a first threshold T1, and thus, the detection value mapping for a ratio of T1 is set to 1. Similarly, 99% of the ratio values in unmarked images fall below T2, and therefore, the detection value is set to 2. 99.9% of the ratio values in unmarked images fall below T3, and the corresponding detection value is set to 3. The threshold values, T1, T2 and T3, may be determined by performing a statistical analysis of several images. The mapping of ratios to detection values based on the statistical distribution may be implemented in a look up table.
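The ratio computation and the threshold look up can be sketched as follows. The threshold values used here are made up for illustration; in practice T1, T2 and T3 would come from the statistical analysis of unmarked images described above. The function names are hypothetical.

```python
from bisect import bisect_right


def peak_ratios(correlations):
    """Ratio of each peak's correlation value to the next largest one.
    The smallest peak has no successor and is left out."""
    ordered = sorted(correlations, reverse=True)
    return [ordered[i] / ordered[i + 1] for i in range(len(ordered) - 1)]


def detection_value(ratio, thresholds):
    """Number of thresholds (sorted ascending: T1, T2, T3) that the ratio
    reaches. A ratio equal to T1 maps to 1, as in the text."""
    return bisect_right(thresholds, ratio)


# Hypothetical thresholds; real values depend on the unmarked image statistics.
thresholds = [1.5, 2.5, 3.5]
ratios = peak_ratios([8.0, 2.0, 1.0])
values = [detection_value(r, thresholds) for r in ratios]
```

In this example the strongest peak is four times the next one and receives detection value 3, while the second peak is only twice the third and receives detection value 1.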

[0170] The statistical analysis may also include a maximum likelihood analysis. In such an analysis, an off-line detector generates detection value statistics for both marked and unmarked images. Based on the probability distributions of marked and unmarked images, it determines the likelihood that a given detection value for an input image originates from a marked or unmarked image.

[0171] At the end of these correlation stages, the detector has computed a ranked set of rotation-scale vectors 990, each with a quantized measure of correlation associated with it. At this point, the detector could simply choose the rotation and scale vectors with the highest rank and proceed to compute other orientation parameters, such as differential scale and translation. Instead, the detector gathers more evidence to refine the rotation-scale vector estimates. FIG. 15 is a flow diagram illustrating a process for refining the orientation parameters using evidence of the orientation signal collected from blocks in the current frame.

[0172] Continuing in the current frame, the detector proceeds to compare the rotation and scale parameters from different blocks (e.g., block 0, block 1, block 2; 1000, 1002, and 1004 in FIG. 15). In a process referred to as interblock coincidence matching 1006, it looks for similarities between rotation-scale parameters that yielded the highest correlation in different blocks. To quantify this similarity, it computes the geometric distance between each peak in one block and every peak in the other blocks. It then computes the probability that peaks will fall within this calculated distance. There are a variety of ways to calculate the probability. In one implementation, the detector computes the geometric distance between two peaks, computes the circular area encompassing the two peaks (π(geometric distance)²), and computes the ratio of this area to the total area of the block. Finally, it quantizes this probability measure for each pair of peaks (1008) by computing the log (base 10) of the ratio of the total area over the area encompassing the two peaks. At this point, the detector has calculated two detection values: quantized peak value, and the quantized distance metric.
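The distance metric in this implementation reduces to a short computation, sketched here. The cap applied when two peaks coincide exactly is an assumption added to avoid division by zero, and the function name and the 128 by 128 block size are hypothetical.

```python
import math


def coincidence_value(peak_a, peak_b, block_area, cap=10.0):
    """Base 10 log of the block area over the circular area pi * d**2,
    where d is the geometric distance between the two peaks.
    Closer peaks give larger values."""
    d = math.hypot(peak_a[0] - peak_b[0], peak_a[1] - peak_b[1])
    if d == 0.0:
        return cap  # assumption: identical peaks get a fixed maximum value
    return min(cap, math.log10(block_area / (math.pi * d ** 2)))


block_area = 128 * 128
near = coincidence_value((10.0, 20.0), (13.0, 24.0), block_area)  # distance 5
far = coincidence_value((10.0, 20.0), (40.0, 60.0), block_area)   # distance 50
```

Peaks five samples apart give a value of about 2.3, while peaks fifty samples apart give about 0.3, so pairs of peaks that agree closely across blocks contribute more evidence.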

[0173] The detector now forms multi-block groupings of rotation-scale vectors and computes a combined detection value for each grouping (1010). The detector groups vectors based on their relative geometric proximity within their respective blocks. It then computes the combined detection value by combining the detection values of the vectors in the group (1012). One way to compute a combined detection value is to add the detection values or add a weighted combination of them.

[0174] Having calculated the combined detection values, the detector sorts the groupings by their combined detection values (1014). This process produces a set of the top groupings of unrefined rotation-scale candidates, ranked by detection value 1016. Next, the detector weeds out rotation-scale vectors that are not promising by excluding those groupings whose combined detection values are below a threshold (the "refine threshold" 1018). The detector then refines each individual rotation-scale vector candidate within the remaining groupings.

[0175] The detector refines a rotation-scale vector by adjusting the vector and checking to see whether the adjustment results in a better correlation. As noted above, the detector may simply pick the best rotation-scale vector based on the evidence collected thus far, and refine only that vector. An alternative approach is to refine each of the top rotation-scale vector candidates, and continue to gather evidence for each candidate. In this approach, the detector loops over each vector candidate (1020), refining each one.

[0176] One approach to refining the orientation vector is as follows:

[0177] fix the orientation signal impulse functions ("points") within a valid boundary (1022);

[0178] pre-refine the rotation-scale vector (1024);

[0179] find the major axis and re-fix the orientation points (1026); and

[0180] refine each vector with the addition of a differential scale component (1028).

[0181] In this approach, the detector pre-refines a rotation-scale vector by incrementally adjusting one of the parameters (scale, rotation angle), adjusting the orientation points, and then summing a point-wise multiplication of the orientation pattern and the image block in the Fourier magnitude domain. The refiner compares the resulting measure of correlation with previous measures and continues to adjust one of the parameters so long as the correlation increases. After refining the scale and rotation angle parameters, the refiner finds the major axis, and re-fixes the orientation points. It then repeats the refining process with the introduction of differential scale parameters. At the end of this process, the refiner has converted each scale-rotation candidate to a refined 4D vector, including rotation, scale, and two differential scale parameters.
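The incremental adjustment can be pictured as a simple coordinate search that keeps any step that increases the correlation. The sketch below shows only that control flow. The correlation function is supplied by the caller, and the toy correlation, the step sizes and the names are assumptions; in the detector the measure would be the point-wise product summed in the Fourier magnitude domain, and the orientation points would be re-fixed between passes.

```python
def refine(params, correlation, steps, max_iter=1000):
    """Adjust one parameter at a time, keeping a change only while the
    correlation measure increases. Returns the refined parameters and
    their correlation."""
    best = list(params)
    best_score = correlation(best)
    for _ in range(max_iter):
        improved = False
        for i, step in enumerate(steps):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                score = correlation(trial)
                if score > best_score:
                    best, best_score = trial, score
                    improved = True
                    break  # keep this adjustment, move to the next parameter
        if not improved:
            break
    return best, best_score


def toy_correlation(p):
    """Stand-in correlation measure peaking at rotation 12, scale 1.5."""
    rotation, scale = p
    return -((rotation - 12) ** 2 + (scale - 1.5) ** 2)


refined, score = refine([10, 1.0], toy_correlation, steps=[1, 0.25])
```

Starting from a rotation of 10 and a scale of 1.0, the search settles on the pair that maximizes the toy correlation. Extending the parameter list with two differential scale terms would give the 4D form described above.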

[0182] At this stage, the detector can pick a 4D vector or set of 4D vectors and proceed to calculate the final remaining parameter, translation. Alternatively, the detector can collect additional evidence about the merits of each 4D vector.

[0183] One way to collect additional evidence about each 4D vector is to re-compute the detection value of each orientation vector candidate (1030). For example, the detector may quantize the correlation value associated with each 4D vector as described above for the rotation-scale vector peaks (see item 988, FIG. 14 and accompanying text). Another way to collect additional evidence is to repeat the coincidence matching process for the 4D vectors. For this coincidence matching process, the detector computes spatial domain vectors for each candidate (1032), determines the distance metric between candidates from different blocks, and then groups candidates from different blocks based on the distance metrics (1034). The detector then re-sorts the groups according to their combined detection values (1036) to produce a set of the top P groupings 1038 for the frame.

[0184] FIG. 16 is a flow diagram illustrating a method for aggregating evidence of the orientation signal from multiple frames. In applications with multiple frames, the detector collects the same information for orientation vectors of the selected blocks in each frame (namely, the top P groupings of orientation vector candidates, e.g., 1050, 1052 and 1054). The detector then repeats coincidence matching between orientation vectors of different frames (1056). In particular, in this inter-frame mode, the detector quantizes the distance metrics computed between orientation vectors from blocks in different frames (1058). It then finds inter-frame groupings of orientation vectors (super-groups) using the same approach described above (1060), except that the orientation vectors are derived from blocks in different frames. After organizing orientation vectors into super-groups, the detector computes a combined detection value for each super-group (1062) and sorts the super-groups by this detection value (1064). The detector then evaluates whether to proceed to the next stage (1066), or repeat the above process of computing orientation vector candidates from another frame (1068).

[0185] If the detection values of one or more super-groups exceed a threshold, then the detector proceeds to the next stage. If not, the detector gathers evidence of the orientation signal from another frame and returns to the inter-frame coincidence matching process. Ultimately, when the detector finds sufficient evidence to proceed to the next stage, it selects the super-group with the highest combined detection value (1070), and sorts the blocks based on their corresponding detection values (1072) to produce a ranked set of blocks for the next stage (1074).

[0186] 4.3 Estimating Translation Parameters

[0187] FIG. 17 is a flow diagram illustrating a method for estimating translation parameters of the orientation signal, using information gathered from the previous stages.

[0188] In this stage, the detector estimates translation parameters. These parameters indicate the starting point of a watermarked block in the spatial domain. The translation parameters, along with rotation, scale and differential scale, form a complete 6D orientation vector. The 6D vector enables the reader to extract luminance sample data in approximately the same orientation as in the original watermarked image.

[0189] One approach is to use generalized match filtering to find the translation parameters that provide the best correlation. Another approach is to continue to collect evidence about the orientation vector candidates, and provide a more comprehensive ranking of the orientation vectors based on all of the evidence gathered thus far. The following paragraphs describe an example of this type of approach.

[0190] To extract translation parameters, the detector proceeds as follows. In the multi-frame case, the detector selects the frame that produced 4D orientation vectors with the highest detection values (1080). It then processes the blocks 1082 in that frame in the order of their detection value. For each block (1084), it applies the 4D vector to the luminance data to generate rectified block data (1086). The detector then performs dual axis filtering (1088) and the window function (1090) on the data. Next, it performs an FFT (1092) on the image data to generate an array of Fourier data. To make correlation operations more efficient, the detector buffers the Fourier values at the orientation points (1094).

[0191] The detector applies a generalized match filter 1096 to correlate a phase specification of the orientation signal (1098) with the transformed block data. The result of this process is a 2D array of correlation values. The peaks in this array represent the translation parameters with the highest correlation. The detector selects the top peaks and then applies a median filter to determine the center of each of these peaks. The center of the peak has a corresponding correlation value and sub-pixel translation value. This process is one example of getting translation parameters by correlating the Fourier phase specification of the orientation signal and the image data. Other methods of phase locking the image data with a synchronization signal like the orientation signal may also be employed.

[0192] Depending on the implementation, the detector may have to resolve additional ambiguities, such as rotation angle and flip ambiguity. The degree of ambiguity in the rotation angle depends on the nature of the orientation signal. If the orientation signal is octally symmetric (symmetric about horizontal, vertical and diagonal axes in the spatial frequency domain), then the detector has to check each quadrant (0-90, 90-180, 180-270, and 270-360 degrees) to find out which one the rotation angle resides in. Similarly, if the orientation signal is quad symmetric, then the detector has to check two cases, 0-180 and 180-270.

[0193] The flip ambiguity may exist in some applications where the watermarked image can be flipped. To check for rotation and flip ambiguities, the detector loops through each possible case, and performs the correlation operation for each one (1100).

[0194] At the conclusion of the correlation process, the detector has produced a set of the top translation parameters with associated correlation values for each block. To gather additional evidence, the detector groups similar translation parameters from different blocks (1102), calculates a group detection value for each set of translation parameters 1104, and then ranks the top translation groups based on their corresponding group detection values 1106.
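As a hedged sketch of the grouping and ranking step, the Python below clusters translation candidates from different blocks that lie within a tolerance of one another and ranks the clusters. The tolerance, the use of a running mean as the group center, and the choice of summed correlation as the group detection value are all assumptions made for illustration; the function name `group_translations` is hypothetical.

```python
def group_translations(candidates, tolerance=1.0):
    """candidates: list of (block_id, dx, dy, correlation) tuples.
    Returns groups sorted by group detection value, highest first.
    The group detection value used here (sum of member correlations)
    is an illustrative assumption."""
    groups = []
    for block_id, dx, dy, corr in candidates:
        for g in groups:
            cx, cy = g["center"]
            if abs(dx - cx) <= tolerance and abs(dy - cy) <= tolerance:
                g["members"].append((block_id, dx, dy, corr))
                n = len(g["members"])
                # Update the running mean of the group's offsets.
                g["center"] = (cx + (dx - cx) / n, cy + (dy - cy) / n)
                g["value"] += corr
                break
        else:
            groups.append({"center": (dx, dy),
                           "members": [(block_id, dx, dy, corr)],
                           "value": corr})
    return sorted(groups, key=lambda g: g["value"], reverse=True)

# Demo: three blocks agree on roughly (5, 7); one block disagrees.
candidates = [(0, 5.0, 7.0, 0.9), (1, 5.2, 7.1, 0.8),
              (2, 20.0, 3.0, 1.0), (3, 4.9, 6.8, 0.7)]
ranked = group_translations(candidates)
```

Here the group of three mutually consistent candidates outranks the single stronger but isolated candidate, which reflects the idea of gathering evidence across blocks.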

[0195] 4.4 Refining Translation Parameters

[0196] Having gathered translation parameter estimates, the detector proceeds to refine these estimates. FIG. 18 is a flow diagram illustrating a process for refining orientation parameters. At this stage, the detector process has gathered a set of the top translation parameter candidates 1120 for a given frame 1122. The translation parameters provide an estimate of a reference point that locates the watermark, including both the orientation and message components, in the image frame. In the implementation depicted here, the translation parameters are represented as horizontal and vertical offsets from a reference point in the image block from which they were computed.

[0197] Recall that the detector has grouped translation parameters from different blocks based on their geometric proximity to each other. Each pair of translation parameters in a group is associated with a block and a 4D vector (rotation, scale, and 2 differential scale parameters). As shown in FIG. 18, the detector can now proceed to loop through each group (1124), and through the blocks within each group (1126), to refine the orientation parameters associated with each member of the groups. Alternatively, a simpler version of the detector may evaluate only the group with the highest detection value, or only selected blocks within that group.

[0198] Regardless of the number of candidates to be evaluated, the process of refining a given orientation vector candidate may be implemented in a similar fashion. In the refining process, the detector uses a candidate orientation vector to define a mesh of sample blocks for further analysis (1128). In one implementation, for example, the detector forms a mesh of 32 by 32 sample blocks centered around a seed block whose upper right corner is located at the vertical and horizontal offset specified by the candidate translation parameters. The detector reads samples from each block using the orientation vector to extract luminance samples that approximate the original orientation of the host image at encoding time.

[0199] The detector steps through each block of samples (1130). For each block, it sets the orientation vector (1132), and then uses the orientation vector to check the validity of the watermark signal in the sample block. It assesses the validity of the watermark signal by calculating a figure of merit for the block (1134). To further refine the orientation parameters associated with each sample block, the detector adjusts selected parameters (e.g., vertical and horizontal translation) and re-calculates the figure of merit. As depicted in the inner loop in FIG. 18 (block 1136 to 1132), the detector repeatedly adjusts the orientation vector and calculates the figure of merit in an attempt to find a refined orientation that yields a higher figure of merit.

[0200] The loop (1136) may be implemented by stepping through a predetermined sequence of adjustments to parameters of the orientation vectors (e.g., adding or subtracting small increments from the horizontal and vertical translation parameters). In this approach, the detector exits the loop after stepping through the sequence of adjustments. Upon exiting, the detector retains the orientation vector with the highest figure of merit.
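The refinement loop can be sketched as follows. This Python example is illustrative only: the orientation vector is modeled as a dictionary with hypothetical keys `tx` and `ty`, the step sizes are assumed, and the figure of merit is passed in as a callable so that any of the measures described in this document could be substituted.

```python
def refine_translation(vector, figure_of_merit,
                       steps=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Step through a fixed sequence of translation adjustments and keep
    the orientation vector with the highest figure of merit.
    `vector` is a dict holding at least 'tx' and 'ty' (assumed names)."""
    best, best_fom = dict(vector), figure_of_merit(vector)
    for dx in steps:
        for dy in steps:
            trial = dict(vector, tx=vector["tx"] + dx, ty=vector["ty"] + dy)
            fom = figure_of_merit(trial)
            if fom > best_fom:
                best, best_fom = trial, fom
    return best, best_fom

# Demo with a stand-in figure of merit that peaks at tx=10.5, ty=4.0.
def demo_fom(v):
    return -((v["tx"] - 10.5) ** 2 + (v["ty"] - 4.0) ** 2)

start = {"tx": 10.0, "ty": 5.0, "rotation": 0.0, "scale": 1.0}
refined, refined_fom = refine_translation(start, demo_fom)
```

Only the translation parameters are adjusted in this sketch; the other orientation parameters are carried through unchanged.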

[0201] There are a number of ways to calculate this figure of merit. One figure of merit is the degree of correlation between a known watermark signal attribute and a corresponding attribute in the signal suspected of having a watermark. Another figure of merit is the strength of the watermark signal (or one of its components) in the suspect signal. For example, a figure of merit may be based on a measure of the watermark message signal strength and/or orientation pattern signal strength in the signal, or in a part of the signal from which the detector extracts the orientation parameters. The detector may compute a figure of merit based on the strength of the watermark signal in a sample block. It may also compute a figure of merit based on the percentage agreement between the known bits of the message and the message bits extracted from the sample block.

[0202] When the figure of merit is computed based on a portion of the suspect signal, the detector and reader can use the figure of merit to assess the accuracy of the watermark signal detected and read from that portion of the signal. This approach enables the detector to assess the merits of orientation parameters and to rank them based on their figure of merit. In addition, the reader can weight estimates of watermark message values based on the figure of merit to recover a message more reliably.

[0203] The process of calculating a figure of merit depends on attributes of the watermark signal and how the embedder inserted it into the host signal. Consider an example where the watermark signal is added to the host signal. To calculate a figure of merit based on the strength of the orientation signal, the detector checks the value of each sample relative to its neighbors, and compares the result with the corresponding sample in a spatial domain version of the orientation signal. When a sample's value is greater than its neighbors, one would expect the corresponding orientation signal sample to be positive. Conversely, when the sample's value is less than its neighbors, one would expect the corresponding orientation sample to be negative. By comparing a sample's polarity relative to its neighbors with the corresponding orientation sample's polarity, the detector can assess the strength of the orientation signal in the sample block. In one implementation, the detector makes this polarity comparison twice for each sample in an N by N block (e.g., N=32, 64, etc.): once comparing each sample with its horizontally adjacent neighbors and then again comparing each sample with its vertically adjacent neighbors. The detector performs this analysis on samples in the mesh block after re-orienting the data to approximate the original orientation of the host image at encoding time. The result of this process is a number reflecting the portion of the total polarity comparisons that yield a match.
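A minimal Python sketch of this polarity comparison is given below. Several details are assumptions made for illustration: a sample is compared with the mean of its adjacent neighbors, samples at the block border use the one neighbor that exists, and an orientation sample of zero is treated as negative. The function names are hypothetical.

```python
def polarity(value, neighbors):
    """+1 if the sample exceeds the mean of its neighbors, -1 if below, 0 if equal."""
    mean = sum(neighbors) / len(neighbors)
    return 1 if value > mean else -1 if value < mean else 0

def orientation_figure_of_merit(block, orientation):
    """Fraction of horizontal and vertical polarity comparisons that match
    the polarity of the corresponding spatial-domain orientation sample."""
    rows, cols = len(block), len(block[0])
    matches = total = 0
    for r in range(rows):
        for c in range(cols):
            expected = 1 if orientation[r][c] > 0 else -1
            horiz = [block[r][k] for k in (c - 1, c + 1) if 0 <= k < cols]
            vert = [block[k][c] for k in (r - 1, r + 1) if 0 <= k < rows]
            for neighbors in (horiz, vert):
                if neighbors:
                    total += 1
                    matches += polarity(block[r][c], neighbors) == expected
    return matches / total if total else 0.0

# Demo: a checkerboard orientation pattern added to a flat host block.
pattern = [[1 if (r + c) % 2 == 0 else -1 for c in range(4)] for r in range(4)]
marked_block = [[128 + 4 * p for p in row] for row in pattern]
inverted = [[-p for p in row] for row in pattern]
fom_match = orientation_figure_of_merit(marked_block, pattern)
fom_mismatch = orientation_figure_of_merit(marked_block, inverted)
```

In this contrived case every comparison agrees with the embedded pattern and disagrees with its inverse; on real image data the figure of merit would fall somewhere between these extremes.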

[0204] To calculate a figure of merit based on known signature bits in a message, the detector invokes the reader on the sample block, and provides the orientation vector to enable the reader to extract coded message bits from the sample block. The detector compares the extracted message bits with the known bits to determine the extent to which they match. The result of this process is a percentage agreement number reflecting the portion of the extracted message bits that match the known bits. Together, the tests for the orientation signal and the message signal provide a figure of merit for the block.

[0205] As depicted in the loop from blocks 1138 to 1130, the detector may repeat the process of refining the orientation vector for each sample block around the seed block. In this case, the detector exits the loop (1138) after analyzing each of the sample blocks in the mesh defined previously (1128). In addition, the detector may repeat the analysis in the loop through all blocks in a given group (1140), and in the loop through each group (1142).

[0206] After completing the analysis of the orientation vector candidates, the detector proceeds to compute a combined detection value for the various candidates by compiling the results of the figure of merit calculations. It then proceeds to invoke the reader on the orientation vector candidates in the order of their detection values.

[0207] 4.5 Reading the Watermark

[0208] FIG. 19 is a flow diagram illustrating a process for reading the watermark message. Given an orientation vector and the corresponding image data, the reader extracts the raw bits of a message from the image. The reader may accumulate evidence of the raw bit values from several different blocks. For example, in the process depicted in FIG. 19, the reader uses refined orientation vectors for each block, and accumulates evidence of the raw bit values extracted from the blocks associated with the refined orientation vectors.

[0209] The reading process begins with a set of promising orientation vector candidates 1150 gathered from the detector. In each group of orientation vector candidates, there is a set of orientation vectors, each corresponding to a block in a given frame. The detector invokes the reader for one or more orientation vector groups whose detection values exceed a predetermined threshold. For each such group, the detector loops over the blocks in the group (1152), and invokes the reader to extract evidence of the raw message bit values.

[0210] Recall that previous stages in the detector have refined orientation vectors to be used for the blocks of a group. When it invokes the reader, the detector provides the orientation vector as well as the image block data (1154). The reader scans samples starting from a location in a block specified by the translation parameters and using the other orientation parameters to approximate the original orientation of the image data (1156).

[0211] As described above, the embedder maps chips of the raw message bits to each of the luminance samples in the original host image. Each sample, therefore, may provide an estimate of a chip's value. The reader reconstructs the value of the chip by first predicting the watermark signal in the sample from the value of the sample relative to its neighbors as described above (1158). If the deduced value appears valid, then the reader extracts the chip's value using the known value of the pseudo-random carrier signal for that sample and performing the inverse of the modulation function originally used to compute the watermark information signal (1160). In particular, the reader performs an exclusive OR operation on the deduced value and the known carrier signal bit to get an estimate of the raw bit value. The reader accumulates these estimates for each raw bit value (1162).
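The chip demodulation and accumulation steps can be sketched in Python as follows. This is a simplified illustration, not the reader itself: deduced chip values are assumed to be already available as 0, 1 or `None` (for samples judged invalid), the per-bit tally and the majority decision are assumed forms of accumulation, and the names `accumulate_chip_votes` and `decide_bits` are hypothetical.

```python
import random

def accumulate_chip_votes(chips, carrier_bits, assignment, num_bits):
    """chips: deduced chip value per sample (0, 1, or None if not valid).
    carrier_bits: known pseudo-random carrier bit per sample.
    assignment: index of the raw bit mapped to each sample.
    Returns a [zeros, ones] tally for each raw bit."""
    votes = [[0, 0] for _ in range(num_bits)]
    for chip, carrier, bit_index in zip(chips, carrier_bits, assignment):
        if chip is None:
            continue  # skip samples whose deduced value did not appear valid
        votes[bit_index][chip ^ carrier] += 1  # XOR undoes the modulation
    return votes

def decide_bits(votes):
    """Majority decision per raw bit (ties resolve to 0 in this sketch)."""
    return [1 if ones > zeros else 0 for zeros, ones in votes]

# Demo: 3 raw bits spread over 30 samples, one chip corrupted, one invalid.
rng = random.Random(7)
raw_bits = [1, 0, 1]
assignment = [i % 3 for i in range(30)]
carrier = [rng.randint(0, 1) for _ in range(30)]
chips = [raw_bits[a] ^ c for a, c in zip(assignment, carrier)]
chips[0] ^= 1      # a single wrong chip estimate
chips[4] = None    # a sample rejected as invalid
votes = accumulate_chip_votes(chips, carrier, assignment, 3)
recovered = decide_bits(votes)
```

Because each raw bit is estimated from many samples, an isolated chip error does not change the recovered bit.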

[0212] As noted above, the reader computes an estimate of the watermark signal by predicting the original, un-watermarked signal and deriving an estimate of the watermark signal based on the predicted signal and the watermarked signal. It then computes an estimate of a raw bit value based on the value of the carrier signal, the assignment map that maps a raw bit to the host image, and the relationship among the carrier signal value, the raw bit value, and the watermark signal value. In short, the reader reverses the embedding functions that modulate the message with the carrier and apply the modulated carrier to the host signal. Using the predicted value of the original signal and an estimate of the watermark signal, the reader reverses the embedding functions to estimate a value of the raw bit.

[0213] The reader loops over the candidate orientation vectors and associated blocks, accumulating estimates for each raw bit value (1164). When the loop is complete, the reader calculates a final estimate value for each raw bit from the estimates compiled for it. It then performs the inverse of the error correction coding operation on the final raw bit values (1166). Next, it performs a CRC check to determine whether the read is valid. If no errors are detected, the read operation is complete and the reader returns the message (1168).

[0214] However, if the read is invalid, then the detector may either attempt to refine the orientation vector data further, or start the detection process with a new frame. Preferably, the detector should proceed to refine the orientation vector data when the combined detection value of the top candidates indicates that the current data is likely to contain a strong watermark signal. In the process depicted in FIG. 19, for example, the detector selects a processing path based on the combined detection value (1170). The combined detection value may be calculated in a variety of ways. One approach is to compute a combined detection value based on the geometric coincidence of the top orientation vector candidates and a compilation of their figures of merit. The figure of merit may be computed as detailed earlier.

[0215] For cases where the read is invalid, the processing paths for the process depicted in FIG. 19 include: 1) refine the top orientation vectors in the spatial domain (1172); 2) invoke the translation estimator on the frame with the next best orientation vector candidates (1174); and 3) re-start the detection process on a new frame (assuming an implementation where more than one frame is available) (1176). These paths are ranked in order from the highest detection value to the lowest. In the first case, the orientation vectors are the most promising. Thus, the detector re-invokes the reader on the same candidates after refining them in the spatial domain (1178). In the second case, the orientation vectors are less promising, yet the detection value indicates that it is still worthwhile to return to the translation estimation stage and continue from that point. Finally, in the third case, the detection value indicates that the watermark signal is not strong enough to warrant further refinement. In this case, the detector starts over with the next new frame of image data.
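The path selection after an invalid read might be expressed as in the short Python sketch below. The two threshold values and the path labels are hypothetical and chosen only to show the ordering of the three paths by combined detection value.

```python
def select_path(combined_detection_value,
                refine_threshold=0.8, retry_threshold=0.5):
    """Choose a processing path after an invalid read.
    Both thresholds are illustrative assumptions."""
    if combined_detection_value >= refine_threshold:
        return "refine_spatial"          # path 1 (1172)
    if combined_detection_value >= retry_threshold:
        return "reestimate_translation"  # path 2 (1174)
    return "next_frame"                  # path 3 (1176)

paths = [select_path(v) for v in (0.9, 0.6, 0.2)]
```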

[0216] In each of the above cases, the detector continues to process the image data until it either makes a valid read, or has failed to make a valid read after repeated passes through the available image data.

[0217] 5.0 Operating Environment for Computer Implementations

[0218] FIG. 20 illustrates an example of a computer system that serves as an operating environment for software implementations of the watermarking systems described above. The embedder and detector implementations are written in C/C++ and are portable to many different computer systems. FIG. 20 generally depicts one such system.

[0219] The computer system shown in FIG. 20 includes a computer 1220, including a processing unit 1221, a system memory 1222, and a system bus 1223 that interconnects various system components including the system memory to the processing unit 1221.

[0220] The system bus may comprise any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using a bus architecture such as PCI, VESA, Microchannel (MCA), ISA and EISA, to name a few.

[0221] The system memory includes read only memory (ROM) 1224 and random access memory (RAM) 1225. A basic input/output system 1226 (BIOS), containing the basic routines that help to transfer information between elements within the computer 1220, such as during start-up, is stored in ROM 1224.

[0222] The computer 1220 further includes a hard disk drive 1227, a magnetic disk drive 1228, e.g., to read from or write to a removable disk 1229, and an optical disk drive 1230, e.g., for reading a CD-ROM or DVD disk 1231 or to read from or write to other optical media. The hard disk drive 1227, magnetic disk drive 1228, and optical disk drive 1230 are connected to the system bus 1223 by a hard disk drive interface 1232, a magnetic disk drive interface 1233, and an optical drive interface 1234, respectively. The drives and their associated computer-readable media provide nonvolatile storage of data, data structures, computer-executable instructions (program code such as dynamic link libraries and executable files), etc. for the computer 1220.

[0223] Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and an optical disk, it can also include other types of media that are readable by a computer, such as magnetic cassettes, flash memory cards, digital video disks, and the like.

[0224] A number of program modules may be stored in the drives and RAM 1225, including an operating system 1235, one or more application programs 1236, other program modules 1237, and program data 1238.

[0225] A user may enter commands and information into the computer 1220 through a keyboard 1240 and pointing device, such as a mouse 1242. Other input devices may include a microphone, joystick, game pad, satellite dish, digital camera, scanner, or the like. A digital camera or scanner 43 may be used to capture the target image for the detection process described above. The camera and scanner are each connected to the computer via a standard interface 44. Currently, there are digital cameras designed to interface with a Universal Serial Bus (USB), Peripheral Component Interconnect (PCI), and parallel port interface. Two emerging standard peripheral interfaces for cameras are USB2 and 1394 (also known as FireWire and i.LINK).

[0226] Other input devices may be connected to the processing unit 1221 through a serial port interface 1246 or other port interfaces (e.g., a parallel port, game port or a universal serial bus (USB)) that are coupled to the system bus.

[0227] A monitor 1247 or other type of display device is also connected to the system bus 1223 via an interface, such as a video adapter 1248. In addition to the monitor, computers typically include other peripheral output devices (not shown), such as speakers and printers.

[0228] The computer 1220 operates in a networked environment using logical connections to one or more remote computers, such as a remote computer 1249. The remote computer 1249 may be a server, a router, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer 1220, although only a memory storage device 1250 has been illustrated in FIG. 20. The logical connections depicted in FIG. 20 include a local area network (LAN) 1251 and a wide area network (WAN) 1252. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

[0229] When used in a LAN networking environment, the computer 1220 is connected to the local network 1251 through a network interface or adapter 1253. When used in a WAN networking environment, the computer 1220 typically includes a modem 1254 or other means for establishing communications over the wide area network 1252, such as the Internet. The modem 1254, which may be internal or external, is connected to the system bus 1223 via the serial port interface 1246.

[0230] In a networked environment, program modules depicted relative to the computer 1220, or portions of them, may be stored in the remote memory storage device. The processes detailed above can be implemented in a distributed fashion, and as parallel processes. It will be appreciated that the network connections shown are exemplary and that other means of establishing a communications link between the computers may be used.

[0231] While the computer architecture depicted in FIG. 20 is similar to typical personal computer architectures, aspects of the invention may be implemented in other computer ... Personal Digital Assistants, audio ...

... elet Decomposition

... watermark in the lowest frequency ... This approach may be used to encode a ... rks. One example is a watermarking ... ection of impulse functions in the ... n of a host signal and embeds a ... ding process can be summarized as ... media signal into low frequency subband ... quency subbands; ... information, such as the orientation ... nstruct a composite media signal. ... extent of the modification to the low ... bband to extract auxiliary information. ... watermark methods detailed above. The ... low frequency subband of the wavelet ... is similar to the process described ... pre-processing step to compute the low ... et transform as a post processing step to ... n. ... hat the wavelet decomposition is a pre- ... orientation signal on the low frequency ...

[0232] Other Wavelet Domain Watermark Methods

[0233] The following sections describe additional wavelet domain watermarking methods. These wavelet-based techniques may be used to implement new wavelet-based watermark encoding and decoding methods or to enhance existing watermark encoding and decoding methods. As such, the following sections describe wavelet-based watermark encoding and decoding methods as well as wavelet-based signal processing methods that may be used to enhance other types of watermark methods.

[0234] Wavelet-Based Watermark Embedding

[0235] FIG. 21 is a block diagram of a wavelet-based watermark embedding process. In this process, the watermark to be embedded includes a message signal component 1300 and an orientation signal component 1302. Both are encoded so as to be imperceptible in a watermarked image. The orientation signal component is used to detect the watermark and to compute its orientation in a signal suspected of containing a watermark. The message signal component carries a message of one or more symbols. Each symbol may be binary, M-ary, etc.

[0236] To embed the message in the host signal, the embedder transforms a message into the message signal and embeds this signal component into the host image. There are many different ways to compute a message signal and embed it into the host image. One way is to spread each bit of a binary message over a pseudorandom number using an exclusive OR or multiplication operation between each bit value of the message and a corresponding pseudorandom number. A message symbol value of binary 1 may be represented as a 1 and binary 0 as a −1. This spreading process creates a message signal comprising an array of positive and negative values. Each of the resulting message signal samples may be mapped to one or more wavelet coefficients, and then embedded by adjusting the coefficients according to a predetermined embedding function. For each message signal sample, evaluation of the embedding function (or its inverse) for a corresponding marked coefficient or coefficients produces a value that corresponds to the desired sample value.
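The spreading operation described above can be sketched in Python using the multiplication form, with binary 1 represented as 1 and binary 0 as −1. The number of chips per bit, the seeded generator used as the pseudorandom source, and the names `spread_message` and `despread` are assumptions for illustration.

```python
import random

def spread_message(bits, chips_per_bit, seed):
    """Multiply each message symbol (+1 for binary 1, -1 for binary 0)
    by a run of pseudorandom +/-1 carrier values."""
    rng = random.Random(seed)
    signal, carrier = [], []
    for bit in bits:
        symbol = 1 if bit else -1
        for _ in range(chips_per_bit):
            c = rng.choice((-1, 1))
            carrier.append(c)
            signal.append(symbol * c)
    return signal, carrier

def despread(signal, carrier, chips_per_bit):
    """Correlate each run of chips with the known carrier to recover the bits."""
    bits = []
    for i in range(0, len(signal), chips_per_bit):
        total = sum(s * c for s, c in zip(signal[i:i + chips_per_bit],
                                          carrier[i:i + chips_per_bit]))
        bits.append(1 if total > 0 else 0)
    return bits

message = [1, 0, 0, 1, 1]
message_signal, carrier = spread_message(message, chips_per_bit=8, seed=42)
decoded = despread(message_signal, carrier, chips_per_bit=8)
```

The resulting `message_signal` is an array of positive and negative values, each of which could then be mapped to one or more wavelet coefficients.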

[0237] As shown in FIG. 21, the Discrete Wavelet Transform (DWT) process 1304 transforms an image 1306 to be embedded to a wavelet transform domain. Unless already defined in the wavelet transform domain, the orientation signal is transformed (1308) into the wavelet domain to prepare it for encoding in that domain.

[0238] In some implementations, the message and orientation signal are integrated into a single watermark signal. Some attributes of the watermark signal may be used to detect the watermark and determine its orientation, while others may be used to convey one or more symbols (e.g., binary bits or M-ary symbols) of a message. One example of an attribute used to detect the watermark is a pattern of impulse functions in the Fourier magnitude domain, where the impulse functions have pseudorandom phase. One example of an attribute used to convey a symbol is the magnitude of the watermark signal, its polarity, or a value computed as a function of the watermarked signal. For example, the presence or absence of a signal sample at the position of one of these impulse functions may correspond to a binary zero or one message symbol. Alternatively, the magnitude or polarity (1 corresponds to a binary 1 and −1 corresponds to a binary 0) of the watermark sample or set of samples may correspond to a binary one or zero message symbol.

[0239] To control perceptibility of the watermark, the encoder uses a human visual system (HVS) model 1310 to analyze the host image and determine how to control encoding of the watermark to reduce perceptibility of the watermark. As shown in FIG. 21, an image adaptive modeling process uses criteria established in the HVS model and applies these criteria to the wavelet transform of the image to determine its perceptual masking capability. The output of the modeling process provides a control signal 1312 that adapts the watermark to the host image based on the computed perceptual masking capability. The control signal may be implemented as a gain vector with elements that adjust the strength of corresponding elements of the watermark signal to adapt it to the host image. Different gain vectors may be applied to the orientation and message signal components (1314, 1316).

[0240] The encoding module combines the watermark signal, including message and orientation signal components (if applicable), with the wavelet domain transform of the host image 1318. The function used to combine the signals may be linear, such as a simple adding of corresponding elements of each signal. Alternatively, the function may be non-linear, such as adjusting a wavelet coefficient or set of coefficients so that a function of those coefficients conforms to a value representing a message symbol value being embedded. The encoder then applies an inverse wavelet transform 1320 to the combined signal to create a watermarked image.
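A linear (additive) combination with a gain vector can be illustrated with the following Python sketch. It assumes a one-dimensional signal, a single-level Haar transform in an averaging form, and embedding in the detail band only; these choices, the gain values, and the function names are illustrative assumptions and not a description of the HVS model or of any particular band selection.

```python
def haar_forward(x):
    """Single-level 1D Haar transform (averaging form): (approx, detail)."""
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out += [a + d, a - d]
    return out

def embed_additive(host, watermark, gain):
    """Add gain-scaled watermark samples to the detail coefficients,
    then return to the sample domain."""
    approx, detail = haar_forward(host)
    marked_detail = [d + g * w for d, g, w in zip(detail, gain, watermark)]
    return haar_inverse(approx, marked_detail)

host = [10, 12, 8, 8, 20, 14, 5, 7]
watermark = [1, -1, 1, -1]
gain = [0.5, 0.5, 1.0, 0.25]   # stand-in for a perceptual gain vector
marked = embed_additive(host, watermark, gain)
approx_after, detail_after = haar_forward(marked)
```

Each element of `gain` scales the corresponding watermark sample, so coefficients judged to offer more masking could carry a stronger signal.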

[0241] Wavelet Based Watermark Detection

[0242] FIG. 22 is a block diagram of a watermark decoding process for a wavelet domain watermark. The input to this process is an image suspected of containing a watermark (suspect image). Initially, the watermark decoder screens the suspect image to determine whether a watermark is present, and to compute the orientation of the watermark in the suspect signal. First, the decoder performs a discrete wavelet transform (DWT) on the suspect image 1400.

[0243] As part of the screening process, the decoder optionally performs a pre-filtering operation 1402 on the image. The purpose of the pre-filtering operation is to enhance watermark detection by removing parts of the host signal that are not likely to contain the watermark signal. As such, the pre-filtering operation filters out parts of the image signal that are not likely to contain the watermark orientation signal.

[0244] The DWT of the image may contribute to the pre-filtering process by helping to isolate the watermark signal. In particular, if the watermark orientation signal is known to be embedded in selected bands, or at least embedded more strongly in selected bands, then the watermark detector selects those bands and performs detection operations on them.

[0245] After applying the DWT, a filter may be applied to remove subband samples below a threshold. The threshold is derived from the signal itself by estimating it from subband samples of a predetermined low resolution cutoff.

[0246] Another operation that can improve detection in some implementations is to add blocks of subband samples together to increase the signal to noise ratio of the watermark signal. This block addition process increases signal to noise ratio in implementations where the watermark signal is repeated in wavelet coefficients throughout the image. For example, if the orientation signal is repeated throughout the image, the signal to noise ratio of the watermark orientation signal may be enhanced by summing groups of wavelet coefficients. By adding blocks of subband samples together, the strength of the common watermark signal in the summed blocks increases.
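The block addition can be sketched in Python as below. The example is deliberately contrived: the subband is one-dimensional, and the host contribution in the two blocks is chosen to cancel exactly so that the effect is easy to see; in practice the host content would only partially cancel. The function name `sum_blocks` is hypothetical.

```python
def sum_blocks(coeffs, block_size):
    """Sum consecutive blocks of subband samples element by element."""
    acc = [0.0] * block_size
    for start in range(0, len(coeffs) - block_size + 1, block_size):
        for i in range(block_size):
            acc[i] += coeffs[start + i]
    return acc

# Demo: the same watermark pattern repeats in two blocks, while the
# host contribution differs from block to block (here it cancels exactly).
pattern = [1, -1, 1, 1]
host_part = [3, -2, 0.5, 1]
block_a = [p + h for p, h in zip(pattern, host_part)]
block_b = [p - h for p, h in zip(pattern, host_part)]
summed = sum_blocks(block_a + block_b, block_size=4)
```

The repeated pattern adds coherently across blocks, so its strength in the summed block grows with the number of blocks.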

[0247] The watermark decoder optionally performs a normalization of the subband samples. The normalization may be implemented as part of the pre-filtering process or as a separate process. Normalization is used to make subsequent detection operations less sensitive to noise. Examples of normalization processes include zero mean unity variance normalization or dividing each sample by a local average (e.g., an average of samples in a 3 by 3, 5 by 5, etc. neighborhood around each subband sample).
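Both normalization examples can be sketched in Python as follows. In the local-average version, averaging the magnitudes of the neighborhood samples (rather than their signed values) and truncating the neighborhood at the band edges are assumptions made here to keep the division well behaved; the function names are hypothetical.

```python
import math

def zero_mean_unit_variance(samples):
    """Subtract the mean and divide by the standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n)
    if std == 0:
        return [0.0] * n
    return [(s - mean) / std for s in samples]

def divide_by_local_average(grid, radius=1):
    """Divide each sample by the average magnitude in its neighborhood
    (radius=1 gives a 3 by 3 window, radius=2 a 5 by 5 window)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [abs(grid[i][j])
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            avg = sum(window) / len(window)
            row.append(grid[r][c] / avg if avg else 0.0)
        out.append(row)
    return out

normalized = zero_mean_unit_variance([2, 4, 4, 4, 5, 5, 7, 9])
locally_normalized = divide_by_local_average([[2, 2, 2], [2, 2, 2], [2, 2, 2]])
```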

[0248] The watermark decoder then performs a multi-level correlation to detect a watermark and determine its orientation. Multi-level correlation refers to a correlation process that correlates a reference signal with the suspect signal at two or more levels of a wavelet decomposition. An example of this type of correlation is detailed further below. The multi-level correlation process computes an initial estimate of a watermark's orientation at an initial low level of resolution of the wavelet decomposition. It then refines the orientation at higher levels of resolution. In a two dimensional signal like an image, the orientation may be represented as one or more orientation parameters, such as the rotational angle, scale, translation (e.g., origin coordinates), differential scale, shear, affine transform coefficients, etc.

[0249] As detailed further below, the multi-level correlation process may be used alone or in combination with other detection operations to calculate orientation parameters. Certain types of wavelet transforms are not translation invariant. A translation invariant wavelet transform (e.g., redundant DWT) may be used to compute translation parameters on wavelet domain samples. Alternatively, the detector may transform the watermarked signal into other domains that are translation invariant and compute translation in that domain.

[0250] The correlation process screens out suspect signals that are unlikely to contain a watermark signal. One way to screen the suspect signal is to set thresholds for the amount of correlation measured at each level of resolution. If the measurement of correlation exceeds a threshold, the decoding process continues to refine the orientation parameters (e.g., rotation, scale, translation, etc.) of the watermark. Otherwise, it indicates that the suspect signal does not contain a recoverable watermark.
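The threshold-based screening can be sketched in Python as follows. This sketch covers only the screening decision, not the refinement of orientation parameters. It assumes a one-dimensional signal, uses pairwise averaging as a stand-in for the approximation bands of a wavelet decomposition, and uses a mean-removed normalized correlation; the threshold values and function names are hypothetical.

```python
import math

def normalized_correlation(a, b):
    """Correlation coefficient of two equal-length sequences (0.0 if either is flat)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def approximations(signal, levels):
    """Haar-style approximation bands, coarsest first, the input itself last."""
    bands = [list(signal)]
    for _ in range(levels - 1):
        prev = bands[-1]
        bands.append([(prev[i] + prev[i + 1]) / 2
                      for i in range(0, len(prev) - 1, 2)])
    return bands[::-1]

def multilevel_screen(suspect, reference, thresholds):
    """Correlate at each level, coarsest first; stop as soon as a level
    falls below its threshold. Returns (passed, correlations)."""
    levels = len(thresholds)
    correlations = []
    for s, r, t in zip(approximations(suspect, levels),
                       approximations(reference, levels), thresholds):
        c = normalized_correlation(s, r)
        correlations.append(c)
        if c < t:
            return False, correlations
    return True, correlations

reference = [2, 2, 2, 2, -2, -2, -2, -2, 1, 1, 1, 1, -1, -1, -1, -1]
marked = [10 + 0.5 * r for r in reference]   # reference present, scaled and offset
unmarked = [1, -1] * 8                       # unrelated signal
marked_passed, marked_corrs = multilevel_screen(marked, reference, [0.5, 0.6, 0.7])
unmarked_passed, unmarked_corrs = multilevel_screen(unmarked, reference, [0.5, 0.6, 0.7])
```

Rejecting at the coarsest level avoids the cost of correlating at the higher resolutions for signals that are unlikely to contain the watermark.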

[0251] Having successfully computed the orientation of the watermark signal, the decoder proceeds to decode a message from the watermark. The decoder uses the orientation parameters to orient the image data for message reading operations. The details of the message reading process vary from one implementation to another. In some implementations, the decoder may extract a message from a residual signal after predicting and removing a predicted version of the original un-watermarked signal. Alternatively, the decoder may perform a message decoding by evaluating the embedding function (or its inverse) directly on the re-oriented image data. In implementations where the orientation signal carries a message, the decoder decodes the message from the orientation signal.

[0252] Types of Wavelet Transforms

[0253] Wavelet transforms have a number of attributes that make them useful for watermark applications. This section describes some of these attributes and suggests how one might select a type of wavelet domain transform for watermark applications.

[0254] The wavelet transform provides spatial and frequency localization. In other words, spatial information about the original signal is maintained in the coefficients of the wavelet decomposition. Another attribute of the wavelet transform is that feature extraction can be performed in the wavelet transform domain. For example, edges can be detected and quantified in the wavelet domain. Another attribute is that energy is typically compacted in low frequency subbands.

[0255] There are many types of wavelet transforms. Some examples include Haar, Daubechies, Symlet, Morlet, Mexican Hat, and Meyer. The type selected for a watermarking application depends on the particular needs of that application. There are a number of criteria for selecting the wavelet transform used in a watermark encoding and decoding application. These include: regularity, symmetry, orthogonality, size of support, speed, and reconstruction.

[0256] Regularity is a mathematical property of a transform that refers to its differentiability. If the transform is easily differentiable, meaning that it has few discontinuities, then it has good regularity. Regularity allows for smooth reconstruction of a wavelet decomposed signal.

[0257] Symmetry is a property that relates to the extent to which the transform introduces phase error. A symmetric filter is also referred to as having linear phase. Such a filter introduces no phase error.

[0258] Orthogonality and biorthogonality are properties of a transform that enable fast decomposition and more accurate reconstruction.

[0259] A redundant wavelet transform, as noted above, is translation or “shift” invariant. Redundant wavelet transforms, therefore, are useful in watermark applications where the detector computes translation parameters in the wavelet domain.

[0260] The directional wavelet transform retains directionalinformation, and as such is useful in detectors that compute orientationparameters in the wavelet domain.

[0261] The size of the support refers to the number of coefficients inthe filter. The coefficients are also referred to as the filter taps. Amore compact filter (e.g., fewer taps) is preferred, but not required.

[0262] Encoding a Watermark in the Wavelet Domain

[0263] Previous sections described approaches for watermark encoding in the wavelet domain. This section discusses additional aspects of encoding in the wavelet domain.

[0264] Band Selection

[0265] Band selection refers to a process of selecting the subband or subbands in which to encode and decode a watermark signal. Bands should be selected so as to reduce perceptibility of the watermark, while making the watermark robust to expected forms of manipulation, such as compression, digital to analog and analog to digital conversion, geometric transformation, etc.

[0266] The bands may be selected adaptively in a watermark embedder and detector. For example, the embedder may analyze bands to identify those that have better masking properties and embed the watermark, or components of it, in those bands. Similarly, the detector can perform a similar analysis to isolate bands for detection operations.

[0267] Another aspect of selecting bands for watermarking is determining where to embed, detect and decode different components of the watermark. For example, a watermark orientation signal may be located in a different band from a watermark message signal to minimize interference between the two watermark signal components.

[0268] FIG. 23 shows a diagram of one possible wavelet decomposition of an image. The image is hierarchically decomposed into four levels: an approximate level A3 (the lowest level of the decomposition), and three additional levels, each having a vertical (V), horizontal (H) and diagonal (D) band. The size of each block depicts the resolution of the corresponding band: the smaller the block, the lower the resolution of that band.
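
The hierarchical structure described above can be sketched with a simple Haar decomposition. The following is a minimal illustration only: the Haar filter, the averaging normalization, the function names and the assignment of the labels H and V to particular difference terms are assumptions made for this example, and labeling conventions vary between implementations.

```python
def haar_step(image):
    """One level of a 2-D Haar decomposition using simple averages.

    `image` is a list of equal-length rows with even height and width.
    Returns (A, H, V, D), each half the size of the input in both dimensions.
    """
    A, H, V, D = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4.0)  # local average (approximation)
            h_row.append((p + q - s - t) / 4.0)  # top minus bottom
            v_row.append((p - q + s - t) / 4.0)  # left minus right
            d_row.append((p - q - s + t) / 4.0)  # diagonal difference
        A.append(a_row)
        H.append(h_row)
        V.append(v_row)
        D.append(d_row)
    return A, H, V, D


def decompose(image, levels):
    """Repeatedly split the approximation band, as in a hierarchical decomposition.

    Returns the final approximation band and a list of detail bands,
    ordered from the highest resolution level to the lowest.
    """
    approx = image
    details = []
    for level in range(1, levels + 1):
        approx, H, V, D = haar_step(approx)
        details.append({"level": level, "H": H, "V": V, "D": D})
    return approx, details
```

With `levels=3`, the result corresponds to an approximation band plus three levels of H, V and D bands, each level half the resolution of the one before it.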

[0269] Preferably, watermarks should be inserted in bands that have a greater potential for masking the watermark and that will survive common signal processing operations. Bands with greater average energy and energy compaction are good candidates for watermark embedding.
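
As one hedged illustration of the energy criterion, bands might be ranked by their mean squared coefficient value. The function names and the use of mean squared value as the measure of average energy are assumptions for this sketch; a practical embedder would likely combine this with robustness and masking considerations.

```python
def average_energy(band):
    """Mean squared coefficient value of a band given as a list of rows."""
    values = [v for row in band for v in row]
    return sum(v * v for v in values) / len(values)


def rank_bands(bands):
    """Return band names ordered from highest to lowest average energy.

    `bands` maps a band name (for example "H", "V", "D") to its coefficients.
    """
    return sorted(bands, key=lambda name: average_energy(bands[name]), reverse=True)
```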

[0270] In implementations where there are separate orientation and message signal components of the watermark, the orientation and message signal components can be made orthogonal by embedding them in different bands. Such orthogonal embedding keeps the watermark components from interfering with each other in decoding operations.

[0271] Coefficient Selection

[0272] The watermark signal should be encoded in coefficients of a selected band or bands that have good masking properties and that make the watermark more robust to manipulation. These coefficients tend to be in the mid to lower levels of the wavelet decomposition.

[0273] Adjusting Coefficient Values

[0274] As noted previously, a watermark signal may be encoded using an embedding function that adjusts one or more coefficients to encode a watermark signal sample. The embedding function may add a watermark signal sample to a corresponding coefficient or set of coefficients in one or more bands. Other embedding functions may be used as well. For example, message symbols or corresponding watermark signal sample values may correspond to quantization functions. The quantization function corresponding to a particular symbol is applied to the coefficient or coefficients that correspond to that symbol. Message symbols or corresponding watermark signal sample values may also be encoded by adjusting two or more coefficients so that a function applied to those adjusted coefficients produces a value corresponding to the desired symbol or watermark signal value.
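
A quantization-based embedding function of the kind mentioned above might be sketched as follows. This example assumes binary symbols, a uniform quantizer per symbol (the two quantizers offset from each other by half a step), and a step size of 8.0; none of these choices comes from the description, and the function names are hypothetical.

```python
def embed_bit(coeff, bit, step=8.0):
    """Quantize a coefficient onto the lattice assigned to `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted
    by half a step.
    """
    offset = step / 2.0 if bit else 0.0
    return step * round((coeff - offset) / step) + offset


def read_bit(coeff, step=8.0):
    """Decide which symbol's lattice lies closer to the received coefficient."""
    distance_to_0 = abs(coeff - embed_bit(coeff, 0, step))
    distance_to_1 = abs(coeff - embed_bit(coeff, 1, step))
    return 1 if distance_to_1 < distance_to_0 else 0
```

In this sketch the embedded bit is read back correctly as long as the coefficient is disturbed by less than a quarter of the step size. A larger step gives more robustness at the cost of a larger, and potentially more perceptible, change.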

[0275] Message Spreading

[0276] As noted previously, a watermark message may be spread over a wideband carrier signal. One example is spreading each bit of a binary message over a pseudorandom number. The resulting sample values of the watermark message signal are mapped to wavelet coefficients or sets of coefficients. The encoder then applies an embedding function to embed the watermark message samples into corresponding wavelet coefficients.
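
The spreading and embedding steps can be illustrated with a short sketch. It assumes a carrier of +1/-1 values generated from a seed shared by encoder and decoder, a fixed number of carrier samples per bit, an additive embedding function with a single gain, and a flat list of coefficients; all names and parameters are hypothetical choices for this example.

```python
import random


def make_carrier(length, seed):
    """Pseudorandom +1/-1 carrier; encoder and decoder share `seed`."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def spread(bits, chips_per_bit, seed):
    """Spread each message bit over `chips_per_bit` carrier samples."""
    carrier = make_carrier(len(bits) * chips_per_bit, seed)
    samples = []
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        segment = carrier[i * chips_per_bit:(i + 1) * chips_per_bit]
        samples.extend(sign * chip for chip in segment)
    return samples


def embed(coeffs, samples, gain):
    """Additive embedding: one watermark sample per wavelet coefficient."""
    return [c + gain * s for c, s in zip(coeffs, samples)]


def despread(coeffs, n_bits, chips_per_bit, seed):
    """Recover bits by correlating each segment with the shared carrier."""
    carrier = make_carrier(n_bits * chips_per_bit, seed)
    bits = []
    for i in range(n_bits):
        lo, hi = i * chips_per_bit, (i + 1) * chips_per_bit
        corr = sum(coeffs[k] * carrier[k] for k in range(lo, hi))
        bits.append(1 if corr > 0 else 0)
    return bits
```

Spreading a bit over many coefficients lets the decoder accumulate evidence across all of them, so each individual change can be kept small.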

[0277] Perceptibility Modeling

[0278] The watermark encoder shown in FIG. 21 employs image adaptive perceptual modeling. This section describes various methods to implement this modeling. The human visual system model shown in FIG. 21 represents a model of a human's visual sensitivity to changes in an image. There may be several components of this model, such as:

[0279] 1. Sensitivity to changes at different frequencies;

[0280] 2. Sensitivity to textures or highly busy or active image areas;

[0281] 3. Sensitivity to local contrast;

[0282] 4. Sensitivity to changes in different colors; and

[0283] 5. Sensitivity to edges, including directional edges.

[0284] For each of these components or some combination of them, the human visual model specifies how to adapt the watermark signal to the image. The perceptual analyzer in the encoder computes attributes of the image and applies an HVS model to these attributes to compute control data, such as a scale vector, that adapts the watermark signal to the host image. Attributes of the host image that impact the perceptual analysis may be evaluated in the wavelet domain or in some other domain.

[0285] The wavelet domain transform decomposes the host signal into subbands at different frequencies. As such, the wavelet transform inherently performs a frequency analysis of the image. The perceptual analyzer may select the coefficients in which the watermark signal is embedded, or adjust the strength of the watermark signal for frequencies that are less visually sensitive.

[0286] Image activity over a predetermined area of an image may also be computed in the wavelet domain. For example, signal energy and image edges in a given area of the image can be measured in the wavelet domain. The perceptual analyzer transforms these measurements of signal activity to a corresponding control value that adjusts the strength of the watermark signal according to signal activity in that area.
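
One possible way to turn a wavelet domain activity measurement into a control value is sketched below. It assumes activity is measured as mean squared coefficient value over non-overlapping square blocks of a detail band and that the gain grows linearly with activity between two limits. The linear mapping, the gain limits and the function names are illustrative assumptions, not values taken from this description.

```python
def block_energy(band, top, left, size):
    """Mean squared coefficient value in one size-by-size block of a band."""
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            total += band[r][c] ** 2
    return total / (size * size)


def scale_vector(band, size, min_gain=0.5, max_gain=2.0):
    """Map per-block activity to a watermark gain for each block.

    Busier blocks (higher energy) receive a larger gain, on the assumption
    that they mask changes better than smooth blocks.
    """
    rows, cols = len(band), len(band[0])
    energies = [
        [block_energy(band, r, c, size) for c in range(0, cols - size + 1, size)]
        for r in range(0, rows - size + 1, size)
    ]
    peak = max(max(row) for row in energies)
    if peak == 0.0:
        # No activity anywhere: fall back to the minimum gain.
        return [[min_gain for _ in row] for row in energies]
    return [
        [min_gain + (max_gain - min_gain) * e / peak for e in row]
        for row in energies
    ]
```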

[0287] The encoder may use a model of a human's sensitivity to contrast to map a measurement of local contrast in an area of an image to a control value that controls the strength of the watermark signal in that area. The strength of the watermark signal applied to the wavelet coefficients in that area is then adjusted based on the control value. The model of visual sensitivity to contrast may be derived experimentally by determining when changes to a signal at each of several different contrast levels become visible.

[0288] The encoder may also apply a model of sensitivity to colors. The watermark signal may be embedded by adjusting luminance values. These adjustments may be effected by making changes to color values that make the desired change to luminance. Changes of some colors may be more visible than others. As such, the perceptual analyzer may reduce or modify changes to certain wavelet coefficients based on sensitivity to color changes in an area of the image where those coefficients are located.

[0289] Encoding a Synchronization or Orientation Signal

[0290] As described above, the watermark may include a separate orientation or synchronization signal. One form of orientation signal is one that is defined as an array of impulse functions in the Fourier magnitude domain with pseudorandom phase. In a wavelet domain watermark encoding method, the orientation signal may be located in selected subbands that have a desired masking capability.

[0291] To avoid interference between the message signal and the orientation signal, the encoder may locate these signal components in distinct subbands or subband coefficients.

[0292] The watermark signal may perform functions of carrying a message and facilitating computation of orientation.

[0293] Wavelet Based Watermark Decoding

[0294] FIG. 24 is a diagram depicting a process for multi-level detection of a watermark. There are two primary inputs: a reference signal and an image suspected of containing a watermark. A wavelet domain transform is applied to each of these signals to produce a wavelet decomposition. Since the reference signal is known, its wavelet decomposition can be pre-computed and stored in the decoder.

[0295] As shown in FIG. 24, a feature extractor 1500 operates on the wavelet decomposition of the suspect image 1502. The feature extractor performs a pre-filtering of the wavelet coefficients to extract a signal that is likely to contain the watermark signal (e.g., a signal representing the largest N wavelet coefficients in a band, where N is a predetermined number for that band). The feature extractor may be implemented as described above in connection with FIG. 22.
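
The "largest N coefficients" pre-filter mentioned above might look like the following sketch. It assumes that "largest" refers to magnitude and that the other coefficients are set to zero; the function name is hypothetical. When several coefficients tie at the cutoff magnitude, this version keeps all of them, so slightly more than N values can survive.

```python
def keep_largest(band, n):
    """Zero every coefficient except the n largest in magnitude."""
    if n <= 0:
        return [[0.0 for _ in row] for row in band]
    magnitudes = sorted((abs(v) for row in band for v in row), reverse=True)
    # Magnitude of the n-th largest coefficient (or the smallest, if n is too big).
    cutoff = magnitudes[min(n, len(magnitudes)) - 1]
    return [[v if abs(v) >= cutoff else 0.0 for v in row] for row in band]
```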

[0296] Next, the decoder performs a correlation between the reference signal 1504 and the pre-processed suspect image. In particular, the decoder performs a multi-level correlation of the reference signal and suspect signal at levels of the wavelet decomposition. At each level, the correlation (shown as symbols 1506-1510) may be performed simultaneously on corresponding subbands (H, V, D) of the reference and suspect signals.

[0297] There are a variety of forms of correlation that may be used. One example is to perform a log polar re-sampling of both signals and then perform a generalized matched filtering to measure the correlation in the log polar domain. The location of the highest correlation provides scale and rotation angle parameter candidates of the watermark's orientation.

[0298] Another example is to perform an orientation histogram form of correlation. This type of correlation is depicted in FIG. 25 and described below.

[0299] As depicted in FIG. 24, the correlation stage evaluates corresponding subbands from the reference and suspect signals at a level of decomposition. It then “fuses” the correlation results from these subbands, meaning that it combines the results (1512). The results may be combined in many different ways. Some examples are averaging the detection values for each band or computing a weighted average where detection values from bands that are more likely to have a reliable watermark signal are weighted more heavily than other values. The weighting may be based on the results of the band selection analysis of the detector. For example, bands with higher average energy may be given greater weight than other bands. Another example is applying majority rule criteria where orientation parameters from different bands within a given range of each other are retained. For example, if the orientation parameters from two bands are similar, and the results from a third band are dissimilar to the first two, then the first two are retained and the third is rejected. The retained orientation parameters may then be combined by averaging, weighted averaging, etc.
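
The fusion options described above can be sketched as follows. The example treats each band's result as a single rotation angle estimate in degrees, uses a fixed tolerance to decide which estimates agree, and ignores angle wrap-around at 0/360 degrees for brevity. The function names and the tolerance parameter are assumptions for this illustration.

```python
def weighted_average(estimates, weights):
    """Combine per-band estimates, weighting more reliable bands more heavily."""
    total = sum(weights)
    return sum(e * w for e, w in zip(estimates, weights)) / total


def majority_rule(estimates, tolerance):
    """Retain the largest group of estimates lying within `tolerance` of one member."""
    best = []
    for center in estimates:
        group = [e for e in estimates if abs(e - center) <= tolerance]
        if len(group) > len(best):
            best = group
    return best


def fuse(estimates, tolerance):
    """Reject outlying bands, then average the retained estimates."""
    retained = majority_rule(estimates, tolerance)
    return sum(retained) / len(retained)
```

For example, with estimates of 30, 31 and 75 degrees from the H, V and D bands and a tolerance of 2 degrees, the first two are retained and the third is rejected before averaging.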

[0300] The multi-level correlation may be implemented so as to start with the lowest level of resolution and refine the results of the correlation with each higher level of resolution. In particular, the orientation parameters produced in each level are refined with increasing resolution of the wavelet decomposition. If the measurements of correlation at a given level do not exceed a threshold for that level, the suspect signal is rejected as being un-watermarked (as shown in blocks 1514 and 1516 in FIG. 24). Alternatively, if the detection values from the previous level were exceedingly strong (as measured against a threshold), then the correlation may continue by skipping the current level and proceeding to the next level using an appropriate search space.

[0301] Conversely, if the correlation measurement exceeds a threshold, the detection process narrows the search space for the orientation parameter or parameters (e.g., rotational angle, scale, translation) (1518) and continues to the next level unless it has already reached the highest level of resolution (see blocks 1520, 1522). When it has completed correlation at the highest level of resolution, the detection process returns the orientation parameter or parameters found to provide the highest correlation (1524).
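
The coarse-to-fine control flow of the multi-level detection can be outlined in a short sketch. It searches over a single parameter (rotation angle), takes the correlation measure as a caller-supplied function `correlate(level, angle)`, and narrows the search range to one step on either side of the best candidate at each level. The function name, the number of candidates per level and the narrowing rule are hypothetical, and the option of skipping a level after a very strong detection is omitted.

```python
def multilevel_search(correlate, levels, thresholds, start=(0.0, 360.0), steps=12):
    """Refine a rotation angle estimate from the lowest to the highest resolution.

    `levels` is ordered from lowest to highest resolution and `thresholds`
    holds one detection threshold per level. Returns the best angle found,
    or None if the signal is rejected as un-watermarked at some level.
    """
    lo, hi = start
    best_angle = None
    for level, threshold in zip(levels, thresholds):
        step = (hi - lo) / steps
        candidates = [lo + i * step for i in range(steps + 1)]
        best_score, best_angle = max((correlate(level, a), a) for a in candidates)
        if best_score <= threshold:
            return None  # correlation too weak at this level: reject
        # Narrow the search space around the best candidate for the next level.
        lo, hi = best_angle - step, best_angle + step
    return best_angle
```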

[0302] Different forms of correlation may be used at different levels of the decomposition. In one implementation, for example, a relatively coarse yet efficient composite filter may be applied at a lower resolution decomposition to get an initial rough estimate of the orientation of the watermark signal in the suspect image. For example, a set of fixed composite filters, each tuned to measure correlation at a fixed range of distortion, may be used to get an initial range of the geometric distortion. Each filter, for example, may measure correlation in a range of rotational angles. The range with the best correlation is then used as a limiting range for the rotational angle candidates used in correlation at higher levels of the decomposition.

[0303] Wavelet domain detection operations may be used in combination with other detection operations in other domains to find orientation parameters. For example, the multilevel correlation operation in the wavelet domain may be used to find a rotation angle. The orientation can then be refined by adjusting the image data according to the rotation angle and then calculating other parameters like scale, translation, differential scale and shear in another domain, such as the Fourier magnitude domain or the spatial domain. For example, if the detection process employs a wavelet transform that is not translation invariant, then the rotation angle may be computed in the wavelet domain and the image data may be transformed into another domain for computing other parameters such as translation. Translation may be computed by performing correlation, such as matched filtering, between a reference orientation signal and the image data suspected of containing a watermark.

[0304] Also, the wavelet domain transform may be used to isolate bands where the watermark is expected to be located. After isolating these bands, the remainder of the image data can be ignored in further detection operations in the wavelet or other domains.

[0305] FIG. 25 is a flow diagram depicting a type of correlation process that may be used to refine the initial correlation results. This particular correlation method attempts to find a rotation angle at a given resolution level that provides the best correlation between the reference signal (e.g., the watermark orientation signal) and the suspect signal. The suspect image 1600 is wavelet decomposed into a low resolution level (the approximate level) and higher resolution levels (“r” refers to resolution), each having subbands, s (H, V, and D) (1602).

[0306] The samples in each subband are re-sampled into a polar and magnitude form (1604). Next, the correlation is measured by integrating over the magnitude dimension for each of several rotation angle candidates in a specified range, such as the range provided by the composite filter described above (1606). This process produces a set of correlation values with a measure of correlation for each rotation angle candidate. Next, a zero mean, unity variance normalization is performed on the results and the angle with the highest correlation is selected (1608, 1610).
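
The integration, normalization and selection steps can be sketched as below. The example assumes the subband has already been re-sampled into a two-dimensional array indexed as `polar[magnitude][angle]`, and it interprets "integrating over the magnitude dimension" as summing each angle column. That interpretation, the array layout and the function names are assumptions made for this illustration.

```python
import math


def angle_profile(polar):
    """Sum a polar re-sampled band over its magnitude axis, one value per angle."""
    n_angles = len(polar[0])
    return [sum(row[t] for row in polar) for t in range(n_angles)]


def normalize(values):
    """Zero mean, unit variance normalization of a list of correlation values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0.0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def best_angle(polar, angles):
    """Return the candidate angle with the highest normalized score, and that score."""
    scores = normalize(angle_profile(polar))
    index = max(range(len(scores)), key=scores.__getitem__)
    return angles[index], scores[index]
```

Normalizing to zero mean and unit variance puts the scores from different subbands on a common scale, which makes them easier to compare or combine when an estimate is formed across bands.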

[0307] This process is repeated for each subband in the decomposition level. An estimate of the rotation angle is then computed from the rotation angle candidates selected from each subband (1612). The estimate may be computed as the average or weighted average of the rotational angles computed for each band at the current resolution level.

[0308] Concluding Remarks

[0309] Having described and illustrated the principles of the technology with reference to specific implementations, it will be recognized that the technology can be implemented in many other, different, forms. For example, the wavelet domain based watermark encoding and decoding methods may be applied to still image, video and audio media signal types. The process of detecting a watermark and reading a message embedded in a watermark may be integrated to varying degrees. The detection process may isolate portions of the host signal to identify a watermark and determine its orientation, and the reading process may then operate on the same or different parts of the suspect signal, using the orientation parameters to align the data and decode the watermark message. This approach facilitates fast detection on a subset of the suspect signal, followed by message reading on larger portions of the suspect signal to extract the watermark message. The detection and message reading operations may operate concurrently on the same or different parts of the suspect signal. The watermark message and orientation signal may be integrated or separate.

[0310] To provide a comprehensive disclosure without unduly lengthening the specification, applicants incorporate by reference the patents and patent applications referenced above.

[0311] The particular combinations of elements and features in the above-detailed embodiments are exemplary only; the interchanging and substitution of these teachings with other teachings in this and the incorporated-by-reference patents/applications are also contemplated.

We claim:
 1. A method of encoding an auxiliary signal in a media signal comprising: performing a wavelet decomposition of the media signal into two or more levels of resolution, including an approximate level and one or more higher resolution levels; modifying the approximate level to encode an auxiliary signal such that the modification is substantially imperceptible in an output form of the media signal.
 2. The method of claim 1 wherein the auxiliary signal comprises a watermark orientation signal.
 3. The method of claim 2 wherein the watermark orientation signal forms a predetermined pattern in a frequency domain.
 4. The method of claim 3 wherein the orientation watermark signal comprises a set of impulse functions in the frequency domain.
 5. The method of claim 2 wherein the auxiliary signal carries a message including one or more symbols.
 6. The method of claim 5 wherein the orientation signal has attributes for determining orientation of the watermark signal and attributes for carrying the message.
 7. The method of claim 2 wherein the media signal is an image signal and the orientation watermark signal is used to determine at least one orientation parameter describing orientation of the auxiliary signal in the image.
 8. The method of claim 3 wherein the orientation parameter is a rotation angle.
 9. A computer readable medium having software for performing the method of claim 1.
 10. A method of detecting an auxiliary signal embedded in a media signal, where the auxiliary information is substantially imperceptible in an output form of the media signal; the method comprising: performing a wavelet decomposition of the media signal into two or more levels of resolution, including an approximate level and one or more higher resolution levels; and detecting the auxiliary information from the approximate level.
 11. The method of claim 10 wherein the auxiliary signal comprises a watermark orientation signal; and further including: using the orientation signal to determine orientation of the auxiliary signal in the media signal.
 12. The method of claim 11 wherein the orientation signal forms a predetermined pattern in a frequency domain.
 13. The method of claim 12 wherein the predetermined pattern comprises a set of impulse functions in the frequency domain, the impulse functions having pseudorandom phase.
 14. The method of claim 11 including: performing correlation between a reference orientation signal and the approximate level of the media signal to determine orientation of the auxiliary signal.
 15. The method of claim 14 including: performing a log re-sampling of the approximate level; and performing the correlation on the re-sampled media signal and log sampled version of the reference orientation signal.
 16. The method of claim 14 including: performing a polar re-sampling of the approximate level; and performing the correlation on the re-sampled media signal and polar sampled version of the reference orientation signal.
 17. A computer readable medium having software for performing the method of claim 10.
 18. A method of embedding an auxiliary signal into a media signal so that the auxiliary signal is substantially imperceptible to a human in the embedded media signal, the method comprising: performing a wavelet decomposition of the media signal; and embedding a watermark orientation signal into the wavelet decomposition, wherein the auxiliary signal includes the watermark orientation signal and the watermark orientation signal has attributes used to determine orientation of the auxiliary signal in a geometrically distorted version of the media signal.
 19. The method of claim 18 wherein the auxiliary signal carries a message of one or more symbols.
 20. The method of claim 19 wherein the watermark orientation signal carries the message.
 21. The method of claim 18 wherein the watermark orientation signal forms a predetermined pattern in a frequency domain.
 22. The method of claim 21 wherein the watermark orientation signal comprises a set of impulse functions with random phase in the frequency domain.
 23. A computer readable medium having software for performing the method of claim 18.
 24. A method of detecting an auxiliary signal embedded in a media signal, where the auxiliary signal is substantially imperceptible in an output form of the media signal; the method comprising: performing a wavelet decomposition of the media signal into two or more levels of resolution; and correlating a reference watermark orientation signal with the wavelet decomposition of the media signal to determine orientation of the auxiliary signal in the media signal.
 25. The method of claim 24 including: performing multi-level correlation between the reference watermark orientation signal and the wavelet decomposition.
 26. The method of claim 25 including: performing correlation at an initial low level of resolution of the wavelet decomposition; and refining results of the correlation at least one higher level of resolution of the wavelet decomposition.
 27. The method of claim 24 including: performing correlation at subbands within a level of resolution of the wavelet decomposition; and combining results of the correlations at the subbands.
 28. The method of claim 24 including: performing a polar resampling of at least one level of resolution; and performing correlation on the resampled level of correlation.
 29. A computer readable medium including software for performing the method of claim 24.